Introduction to Orange

Welcome to Introduction to Orange

Welcome to Lecturus Global’s course on the basics of using Orange. Orange is an open-source tool for machine learning and data visualization. It has a large and diverse toolbox that you can use to build data analysis workflows visually. By the end of this course, you will know the following:

How to install Orange on your device.
How to navigate the User Interface (UI) of Orange.
How to select and upload data with Orange.
How to connect to different resources – such as tables or a graph – with Orange.
How to analyze given data with Orange.

With that said, we shall move on to how to install Orange.

Orange - Installation

To begin the installation, first visit the “download” page of the Orange website. Here is a link to that page:

https://orangedatamining.com/download/#windows

From here, click the icon/button for the OS you use. Because this tutorial was written on a Windows machine, we selected Windows. Regardless, make sure you download the latest release of Orange for your system.

Once the file has finished downloading, click it to open the installation wizard. Simply click “Yes”, “I agree”, or “Next” at each prompt; the installation process is very simple and does not require any advanced settings. On the final page of the wizard, you will be given the option to start Orange as soon as the installation finishes. Make sure it is checked/selected, and then finish the installation.

Now you have successfully installed Orange. In the next section, we will go over the UI.

Orange - The User Interface

Upon opening Orange, you will be greeted with a window.

Before you can start to use Orange, you need to get past that window. Let’s discuss each of the buttons, starting with the green ones on the top.

New – This button opens a new Orange Workflow. Clicking it will close the window and allow you to use the screen behind it.
Open – This button will take you to a file explorer where you can find and open old Orange Workflows.
Recent – Like the “Open” button, this will open an existing Orange Workflow, but it will be one that you have opened recently.

Now let us go over the orange buttons on the bottom.

Video Tutorials – This button will take you to a tutorial playlist for the Orange software.
Get Started – This button will take you to the Orange website. This is the same website you downloaded Orange from.
Examples – This button will take you to another window where you can view and interact with several pre-built workflows that demonstrate how various functions of Orange work.
Documentation – This button will take you to a section of the Orange website with information on the numerous widgets and nodes it supports.

With that out of the way, let us continue by clicking “New”, and then we will go over the rest of the Orange UI.

On the leftmost portion of the screen, you will see a box containing 5 categories of items: Data, Visualize, Model, Evaluate and Unsupervised. Let us go over the categories.

The data tab contains widgets used for data manipulation. The most basic – and arguably most important – widget here is the File widget.

The visualize tab contains widgets used for visualizing data. Once you connect your data file to one of the displayed plots, you can see your data in graphic form. Please note that not every node here will be compatible with the data in your file. Choose a visualize node that suits the kind of data you have.

The model tab contains widgets used for prediction. If you have preexisting data but want to know where the data may lead, consider using widgets from this section.

The evaluate tab contains widgets used for evaluating classification or regression performance. If you wish to check how well a model captures a trend or pattern in your data, consider using widgets from here.

The unsupervised tab contains widgets used for finding structure in data that has no target variable, such as grouping similar rows into clusters.

With that said, let us move on to the next section.

Orange - Selecting and Uploading Data

Selecting and uploading data in Orange is a very simple task. To start, drag a File node from the data tab discussed in the previous section onto the blank workflow space.

Next, double click the File node now that it is in the workflow space.

Now let us go over what is seen here. The source sub-window is where you upload or select your data. To upload data, either choose one of the prebuilt data sets – such as zoo.tab in the current picture – or click the file icon next to the file selection area. Doing this will open a file explorer, from which you can find your own data to upload. Alternatively, if you have data from a website, paste the website’s URL and upload/select data that way.

In the second sub-window, information about the dataset is shown. Because we are currently using the zoo.tab premade dataset, it has a description set already.

In the third sub-window, we have columns. From here, we can see various rows and columns that define the data. Double clicking any of the elements here will allow you to change it. For example, double clicking “categorical” will allow you to change it to something else, like “numeric”. Once you are done with these changes, click “apply”, or if you do not want to go through with these changes, click “reset”. For the sake of continuity, we will use the zoo.tab premade dataset, with no modifications. Once you have selected your data and made changes (or no changes at all), simply close the window. We will now move on to the next section.
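
For readers curious about what a column type means, here is a minimal Python sketch of the idea, using only the standard library. It is not how Orange itself decides types; the sample rows are made up for illustration (they are not the contents of zoo.tab), and the function names are hypothetical. It guesses a type for each column and then overrides one by hand, much like double clicking a type in the columns sub-window.

```python
import csv
import io

# Made-up, tab-separated sample rows for illustration only.
# These are NOT the actual contents of Orange's zoo.tab file.
SAMPLE = (
    "name\thair\tlegs\ttype\n"
    "cat\tyes\t4\tmammal\n"
    "hawk\tno\t2\tbird\n"
    "trout\tno\t0\tfish\n"
)


def is_number(value):
    """Return True if the text can be read as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def guess_types(text):
    """Guess 'numeric' or 'categorical' for each column (a simplified rule)."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
    types = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        if all(is_number(v) for v in values):
            types[column] = "numeric"
        else:
            types[column] = "categorical"
    return types


types = guess_types(SAMPLE)
# Manual override, comparable to changing a type in the columns sub-window.
types["legs"] = "categorical"
print(types)
```

In this sketch the "legs" column is first guessed as numeric because every value is a number, and the override then treats it as a set of categories instead.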

Orange - Connecting Resources

Like selecting and uploading data, connecting to resources is a simple task. For the sake of education and simplicity, we shall connect a file node to a distribution plot node. Open the “Visualize” tab on the box on the leftmost portion of the Orange workspace. From there, select the “Distributions” widget. In the same way you did for the file widget, click it and drag it onto the workspace.

Once that is done, find the little dotted arc to the right of the file widget. Click and hold it, then drag. You should see a line pop out. Drag this line to the dotted arc to the left of the distributions widget.

You have just connected resources in Orange; it is that easy. But let us analyze the connection further by double clicking the line in between the two.

By analyzing the line, we can verify that data is coming to the distributions node from the file node. Click “OK” to close this window.
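
The connection can be pictured as one step handing its output to the next. The Python sketch below is only an analogy under that assumption, not Orange code: the two functions are hypothetical stand-ins for the file and distributions widgets, and the rows are made up.

```python
from collections import Counter


def file_widget():
    """Stand-in for the File widget: outputs a list of rows."""
    # Made-up rows for illustration, not the real zoo.tab contents.
    return [
        {"name": "cat", "type": "mammal"},
        {"name": "dog", "type": "mammal"},
        {"name": "hawk", "type": "bird"},
    ]


def distributions_widget(data, variable):
    """Stand-in for the Distributions widget: counts the values of one variable."""
    return Counter(row[variable] for row in data)


# The "line" between the widgets: the output of one is the input of the other.
counts = distributions_widget(file_widget(), "type")
print(counts)
```

Without the connection, the second step would have nothing to count, which is why the distributions node stays empty until a line is drawn to it.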

We will now move on to the next and final section of this guide: Analyzing the data and results.

Orange - Analysis and Results

In this section, we will go over how to analyze data with Orange. We will use the same workspace as we did in the last section. To begin analyzing data, double click the distributions node.

Let us dissect this window. Let us start with the big screen to the right of all the other objects within the window. This is where the graphs for each category will be shown. As we can see, it is a distribution graph for the types of animals at the zoo; mammals make up most of the animals at this zoo. Now let us return our attention to the left side of the window.

In the variable sub-window, we can see all the different variables that describe the animals in the zoo. For any dataset, this sub-window lists all of its variables. In this specific scenario, we can distinguish animals by type, by whether they have hair, etc.

In the Distribution sub-window, we can control various features. Fitted distribution can display lines that help indicate distribution trends, bin width increases or decreases the width of the bars on the graph, and smoothing is another tool that can be used. Additionally, you can choose to hide the bars.
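
To see what changing the bin width does to numeric data, here is a small Python sketch. It is an illustration under simple assumptions, not the widget's own code: the function name is hypothetical and the leg counts are made up.

```python
from collections import Counter


def bin_counts(values, bin_width):
    """Group numbers into bins of the given width and count each bin."""
    counts = Counter()
    for value in values:
        # Each value falls into the bin that starts at a multiple of bin_width.
        bin_start = (value // bin_width) * bin_width
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


# Made-up numbers of legs, for illustration only.
legs = [0, 2, 2, 4, 4, 4, 6, 8]

narrow = bin_counts(legs, 2)  # many thin bars
wide = bin_counts(legs, 4)    # fewer, wider bars
print(narrow)
print(wide)
```

A wider bin merges neighboring values into one bar, so the graph shows fewer bars with higher counts.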

Lastly, we have the columns sub-window. Here, you can choose to split the columns by a variable, which means you can bring in a second variable to break the graph down by. For example, let us say you want to know if any of the milk-producing animals are also predators. You would select “predator” in the variable sub-window, and then split it by “milk” in the columns sub-window. Additionally, you can stack columns to get better comparisons. For example, you may want to know how many animals – regardless of other factors – have hair or do not have hair. You would select “hair” in the variable sub-window, and then check “Stack columns”. You can also check the probability of something in this same window. For example, you may want to check the probability that a mammal has hair or not. You would keep “hair” as the variable, and then check “Show probabilities”. Lastly, if you are working with numeric data, you can check the cumulative distribution of the data.
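
The idea behind splitting by a variable can be sketched in a few lines of Python. This is a rough illustration, not what the widget runs internally: the rows are made up (not the real zoo.tab data), the function names are hypothetical, and the proportions computed here are just one simple way to read "probabilities", which may differ from the widget's own calculation.

```python
from collections import Counter, defaultdict

# Made-up rows for illustration; NOT the real zoo.tab contents.
ANIMALS = [
    {"name": "cat", "milk": "1", "predator": "1"},
    {"name": "cow", "milk": "1", "predator": "0"},
    {"name": "hawk", "milk": "0", "predator": "1"},
    {"name": "trout", "milk": "0", "predator": "1"},
    {"name": "dove", "milk": "0", "predator": "0"},
]


def split_counts(rows, variable, split_by):
    """Count the values of `variable` separately for each value of `split_by`."""
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[split_by]][row[variable]] += 1
    return groups


def proportions(counter):
    """Turn counts into shares of the group total."""
    total = sum(counter.values())
    return {value: count / total for value, count in counter.items()}


# "predator" as the variable, split by "milk".
by_milk = split_counts(ANIMALS, "predator", "milk")
milk_producers = by_milk["1"]
print(dict(milk_producers))
print(proportions(milk_producers))
```

Here the group with milk equal to "1" is counted on its own, so you can read off how many milk-producing animals in the sample are predators and what share of that group they make up.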

Once you are done with the manipulations, click “apply” to save your changes. As far as results are concerned, you have now successfully created a basic data model. You dragged a file node onto the workspace, loaded data into it, connected it to a distributions node, and viewed the data. Congratulations!

Conclusion

We have reached the end of Lecturus Global’s course on the basics of using Orange. It is our hope that you learned a great deal about the basics of Orange from our tutorial. We thank you for your time in reading this and watching the videos of a live demonstration, and we hope that you continue your education and practice with Orange. Remember: Orange is a very simple – and very powerful – data modeling tool. Thanks for reading!

Made by Aaron Arroyo, Graduated 2021 from New York City College of Technology

Get In Touch

147 Prince St, Brooklyn, NY 11201
