Dsaa204 Use Of Decision Tree Answers


  • Subject Code : DSAA204
  • University : Kent Institute
  • Subject Name : IT Computer Science

Data Structure And Algorithms

Contents

Executive Summary

Introduction

Background

Case Study and Design

Operations and the Justification

Algorithms and the Justification

Modifications

Conclusion

References

Executive Summary

The variable id is used as the primary key for doctors, nursing staff and patients, so that each record can be told apart. Other variables, such as name, age and contact, hold the details of each doctor, nurse or patient.

Five management operations are proposed for the records of doctors, nursing staff and patients in the data structure: addition, updating, deletion, searching and displaying. (F. Yuan, 2015)

The number of doctors and nursing staff is fixed at 50 and 100 respectively, so we selected arrays to store their data. An array allows direct access by index, and it needs less memory than linked structures such as a linked list because it stores no per-element pointers. The number of patients is not fixed, so we selected an AVL binary tree for patient data because it is not bound to a specific size. Its self-balancing property also keeps the height of the tree small, which makes it well suited to binary search. We selected the binary search algorithm for searching the AVL tree because each comparison keeps only the part of the data that can contain the search element and discards the rest, roughly half of the remaining records in a balanced tree. This step is repeated until the element is found or no data remains. Discarding about half of the data at every step saves a great deal of time, but the approach works only on sorted data, which the AVL tree maintains by keeping its keys in order. We chose the data structures and the search algorithm this way because the add, update and delete operations all depend on the search operation, so search performance and data size were our main selection factors.

When the data size changes, we will have to create new, larger arrays for doctors and nursing staff and copy the existing data from the old arrays. No change is required for patients, because an AVL binary tree has no fixed limit on data size. (Chung, 1992)

Introduction

Data structures are specific formats for saving, organizing and managing data. Data structures perform differently in different environments, and every data structure is designed for a specific environment in which it performs efficiently. Different data management systems have different requirements for saving and managing data, depending on various factors, and therefore require different data structures. (F. Yuan, 2015)

When using a data structure, it is best to pick the one that fits your environment and performs the task efficiently. The benefit of using the most suitable data structure is improved performance and efficiency. Specific factors should be considered when picking a data structure, such as what type of data it holds, how much storage can be allocated for it and which operations have the highest priority for efficiency. Based on these factors, the candidate data structures are filtered and the most suitable one is selected. (A. Navada, 2011)

An algorithm is a specific set of instructions for performing a specific task. It can be very simple or very complex. Examples include sorting algorithms such as bubble sort and selection sort. Algorithms are general instructions that do not depend on where they are used, as long as they are applied to the task they were designed for.

An ideal algorithm for a specific task should outperform the other algorithms for the same task not only in the average case but also in the worst case. Selecting the most suitable algorithm for a task improves performance and efficiency.

Data structures require algorithms for saving, organizing and managing the data within them, so data structures and algorithms are linked together. Both contribute to the optimization of the management system, and each depends on the other. (Yokoo, 1993)

Background

A system can easily be designed to perform specific functions or tasks, but designing an optimized system for the same purpose is a challenge in itself. In this report we design a hospital management system that performs optimally. Designing such a system requires a data structure for storing data, and that choice plays a vital role in the performance of the system. Many types of data structure are in common use, such as the array, queue, stack, linked list, binary tree, heap and graph. They should be chosen carefully for better performance in the system. Similarly, the algorithms that perform specific operations on the data within a data structure should also be chosen carefully. The main focus of the report is to select a specific data structure for storing data related to the doctors, nursing staff and patients of the hospital, and to select the best algorithm for searching for a doctor, nurse or patient in the selected data structure. The report also gives the reasons for selecting each data structure and algorithm, and describes the modifications needed when the requirements change. (Y. Liu, 2017)

In the case study scenario for which we design our hospital management system, the hospital has fifty doctors, one hundred nurses and approximately one thousand patients. We have to select appropriate data structures for storing the data and optimal algorithms for performing specific tasks or functionalities. These tasks are defined mainly for searching and managing the data of doctors, nursing staff and patients. We also have to discuss the changes in our design when the number of doctors, nursing staff members and patients doubles. (K. O. Thabit, 2017)

Case Study and Design

  • Variables, Ranges and Keys

Doctor:

| Variable Name | Range | Type | Key | Details |
|---|---|---|---|---|
| Doctor Id | 1 to 50 | Integer | Primary | Unique Identifier |
| Name | 200 max length | String | | Doctor Name |
| Specialization | 200 max length | String | | Doctor Specialization |
| Age | 1 to 200 | Integer | | Age in years |
| Contact | 11 digits | Integer | | Doctor Phone Number |
| Address | | String | | Home Address |

Nurse:

| Variable Name | Range | Type | Key | Details |
|---|---|---|---|---|
| Nurse Id | 1 to 100 | Integer | Primary | Unique Identifier |
| Name | 200 max length | String | | Nurse Name |
| Age | 1 to 200 | Integer | | Age in years |
| Contact | 11 digits | Integer | | Nurse Phone Number |
| Address | | String | | Home Address |

Patient:

| Variable Name | Range | Type | Key | Details |
|---|---|---|---|---|
| Patient Id | 1 min | Integer | Primary | Unique Identifier |
| Name | 200 max length | String | | Patient Name |
| Age | 1 to 200 | Integer | | Age in years |
| Contact | 11 digits | Integer | | Patient Phone Number |
| Address | | String | | Home Address |
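
The variable tables above can be sketched as record types. This is a minimal illustration only, not part of the original design: the class names, the shared Person base class and the validation checks are assumptions, and Contact is stored as a string here (the tables list it as Integer) so that leading zeros in a phone number are not lost.

```python
from dataclasses import dataclass


@dataclass
class Person:
    """Fields shared by doctors, nurses and patients (hypothetical base class)."""
    id: int        # primary key, unique within its own group, minimum 1
    name: str      # at most 200 characters
    age: int       # 1 to 200, in years
    contact: str   # 11 digits; assumption: kept as text, not Integer
    address: str   # home address

    def __post_init__(self):
        # Enforce the ranges given in the variable tables.
        if self.id < 1:
            raise ValueError("id must be at least 1")
        if len(self.name) > 200:
            raise ValueError("name is limited to 200 characters")
        if not 1 <= self.age <= 200:
            raise ValueError("age must be between 1 and 200")
        if not (len(self.contact) == 11 and self.contact.isdigit()):
            raise ValueError("contact must be exactly 11 digits")


@dataclass
class Doctor(Person):
    specialization: str = ""   # at most 200 characters

    def __post_init__(self):
        super().__post_init__()
        if len(self.specialization) > 200:
            raise ValueError("specialization is limited to 200 characters")


@dataclass
class Nurse(Person):
    pass


@dataclass
class Patient(Person):
    pass
```

The upper limit on Doctor Id and Nurse Id (50 and 100) is left to the container that stores the records, since that limit belongs to the size of the data structure rather than to a single record.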

Operations and the Justification

Doctor:

| Operation | Justification |
|---|---|
| Adding a doctor | Saving doctor details for the first time in data structure requires an operation of addition |
| Updating doctor details | Updating details of existing doctor in data structure requires update operation |
| Deleting a doctor | Removing doctor from data structure requires delete operation |
| Search a doctor | Searching details of a specific doctor in the data structure requires search operation |
| Display all doctors | Displaying list of all doctors in the data structure requires display operation |

Nurse:

| Operation | Justification |
|---|---|
| Adding a nurse | Saving nurse details for the first time in data structure requires an operation of addition |
| Updating nurse details | Updating details of existing nurse in data structure requires update operation |
| Deleting a nurse | Removing nurse from data structure requires delete operation |
| Search a nurse | Searching details of a specific nurse in the data structure requires search operation |
| Display all nurses | Displaying list of all nurses in the data structure requires display operation |

Patient:

| Operation | Justification |
|---|---|
| Adding a patient | Saving patient details for the first time in data structure requires an operation of addition |
| Updating patient details | Updating details of existing patient in data structure requires update operation |
| Deleting a patient | Removing patient from data structure requires delete operation |
| Search a patient | Searching details of a specific patient in the data structure requires search operation |
| Display all patients | Displaying list of all patients in the data structure requires display operation |

Algorithms and the Justification

We have to save data for doctors, nursing staff and patients. When selecting the algorithm and data structure for each group, we consider two factors: the data size and the operations to perform. As the operations for doctors, nursing staff and patients are generally the same, we do not analyze them separately. The add, update and delete operations depend on the search operation, because each must first find the record before it can do its work, so we give search the highest priority in the selection process. We know the exact number of doctors and nursing staff, while the exact number of patients cannot be determined. So we have two cases: one with a fixed data size and one with a variable data size. (K. O. Thabit, 2017)

For fixed-size data, such as doctors and nursing staff, arrays are the best option because an array can be initialized with the exact size and gives direct access to its elements through their indexes. An array also needs less memory than dynamically growing data structures such as a linked list. As each element of the array can be accessed through its index, we do not need any special search algorithm: we can find a record by accessing the element at the index one less than its id.
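
The index rule described above (record with id k stored at index k - 1) can be sketched as follows. This is an illustrative sketch under stated assumptions: the class name FixedArrayStore, its method names and the use of None for an empty slot are all hypothetical, and a Python list of preallocated slots stands in for a fixed-size array.

```python
class FixedArrayStore:
    """Fixed-capacity store: the record with id k lives at index k - 1."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity   # one preallocated slot per possible id

    def _index(self, record_id):
        # Map an id in 1..capacity to an array index in 0..capacity-1.
        if not 1 <= record_id <= self.capacity:
            raise IndexError(f"id {record_id} is outside 1..{self.capacity}")
        return record_id - 1

    def add(self, record_id, details):
        i = self._index(record_id)
        if self.slots[i] is not None:
            raise KeyError(f"id {record_id} is already in use")
        self.slots[i] = details

    def search(self, record_id):
        # A single index access; returns None when the slot is empty.
        return self.slots[self._index(record_id)]

    def update(self, record_id, details):
        i = self._index(record_id)
        if self.slots[i] is None:
            raise KeyError(f"no record with id {record_id}")
        self.slots[i] = details

    def delete(self, record_id):
        self.slots[self._index(record_id)] = None

    def display(self):
        # List every stored record together with its id, in id order.
        return [(i + 1, rec) for i, rec in enumerate(self.slots) if rec is not None]


doctors = FixedArrayStore(50)    # 50 doctors in the case study
nurses = FixedArrayStore(100)    # 100 nurses in the case study
doctors.add(7, {"name": "Dr. Example", "specialization": "Cardiology"})
```

Add, update and delete each reuse the same one-step index lookup as search, which reflects the point made earlier that those operations depend on the search operation.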

For data whose size changes, such as patients, we have to use a data structure that grows and shrinks with the data. The data structure should also support an efficient search algorithm. Many data structures, such as the linked list, queue, heap and tree, adapt to a changing data size, but balanced search trees are among the best suited to search problems. In our case we use an AVL binary tree for saving patients and the binary search algorithm for searching within the AVL binary tree. (F. Yuan, 2015)

An AVL binary tree is a binary search tree that rebalances itself whenever an insertion or deletion disturbs its balance. It has the following properties:

  • The heights of the left and right sub-trees of any node differ by at most one

  • Every sub-tree is itself an AVL tree
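
The two properties above can be illustrated with a short sketch of AVL insertion and search keyed on patient id. It is a simplified illustration, not the report's implementation: the names Node, insert, search and checked_height are assumptions, deletion is omitted, and height is counted in levels (a single node has height 1).

```python
class Node:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.left = self.right = None
        self.height = 1              # a single node counts as height 1


def height(node):
    return node.height if node else 0


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def balance(node):
    return height(node.left) - height(node.right)


def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y     # x becomes the root of this sub-tree
    update(y)
    update(x)
    return x


def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x      # y becomes the root of this sub-tree
    update(x)
    update(y)
    return y


def insert(node, key, value):
    """Insert key below node and return the (possibly new) sub-tree root."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        node.left = insert(node.left, key, value)
    elif key > node.key:
        node.right = insert(node.right, key, value)
    else:
        node.value = value           # same id: overwrite the stored details
        return node
    update(node)
    b = balance(node)
    if b > 1:                        # left side is too tall
        if balance(node.left) < 0:   # left-right case
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if b < -1:                       # right side is too tall
        if balance(node.right) > 0:  # right-left case
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


def search(node, key):
    """Walk down from the root, discarding one sub-tree at every step."""
    while node is not None and node.key != key:
        node = node.left if key < node.key else node.right
    return node.value if node else None


def checked_height(node):
    """Return the real height, or -1 if any sub-tree breaks the AVL rule."""
    if node is None:
        return 0
    lh, rh = checked_height(node.left), checked_height(node.right)
    if lh < 0 or rh < 0 or abs(lh - rh) > 1:
        return -1
    return 1 + max(lh, rh)


root = None
for patient_id in range(1, 1001):    # about 1000 patients in the case study
    root = insert(root, patient_id, {"name": f"Patient {patient_id}"})
```

Inserting the ids in ascending order would turn an ordinary binary search tree into a long chain; here the rotations keep the tree balanced, which checked_height(root) confirms by returning a non-negative height.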

Binary search is an algorithm that works on sorted data. To find an element in a large data set, it splits the data into two parts in each iteration and keeps only the part that can contain the element for the next iteration. This process continues until the element is found or no data remains. (Chung, 1992)
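
The halving described above can be shown on a plain sorted list of ids. This sketch is only an illustration of the technique: the function name binary_search and the returned step counter are assumptions added to make the number of iterations visible.

```python
def binary_search(sorted_ids, target):
    """Return (index, steps) for target in sorted_ids, or (-1, steps) if absent."""
    low, high = 0, len(sorted_ids) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_ids[mid] == target:
            return mid, steps
        if sorted_ids[mid] < target:
            low = mid + 1            # discard the lower half
        else:
            high = mid - 1           # discard the upper half
    return -1, steps


patient_ids = list(range(1, 1001))   # 1000 sorted ids, as in the case study
index, steps = binary_search(patient_ids, 777)
```

For 1000 sorted ids the loop never runs more than 10 times, because the remaining range is at least halved on every iteration. Searching a balanced tree follows the same idea, with each move to a left or right child discarding the other sub-tree.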

Modifications

Because the number of doctors, nursing staff and patients has doubled, we have to update the ranges of the id variables for doctors and nursing staff. No update is required for patients, because the patient id has no maximum limit as the data size is not fixed. The updated variable tables are shown below:

Doctor:

| Variable Name | Range | Type | Key | Details |
|---|---|---|---|---|
| Doctor Id | 1 to 100 | Integer | Primary | Unique Identifier |
| Name | 200 max length | String | | Doctor Name |
| Specialization | 200 max length | String | | Doctor Specialization |
| Age | 1 to 200 | Integer | | Age in years |
| Contact | 11 digits | Integer | | Doctor Phone Number |
| Address | | String | | Home Address |

Nurse:

| Variable Name | Range | Type | Key | Details |
|---|---|---|---|---|
| Nurse Id | 1 to 200 | Integer | Primary | Unique Identifier |
| Name | 200 max length | String | | Nurse Name |
| Age | 1 to 200 | Integer | | Age in years |
| Contact | 11 digits | Integer | | Nurse Phone Number |
| Address | | String | | Home Address |

Patient:

| Variable Name | Range | Type | Key | Details |
|---|---|---|---|---|
| Patient Id | 1 min | Integer | Primary | Unique Identifier |
| Name | 200 max length | String | | Patient Name |
| Age | 1 to 200 | Integer | | Age in years |
| Contact | 11 digits | Integer | | Patient Phone Number |
| Address | | String | | Home Address |

Because the doctor and nursing staff data is stored in fixed-size arrays, we have to create new arrays of double the size, copy all data from the previous arrays into them and then add the new data. These steps are feasible while the data is not very large. When the data becomes very large, it is preferable to move to an AVL binary tree, so that large amounts of data are not copied again and again at a considerable cost in time. (A. Navada, 2011)
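
The copy step described above can be sketched as follows. The function name grow is hypothetical, and a Python list with None for empty slots stands in for the fixed-size array.

```python
def grow(old_slots, new_capacity):
    """Return a new array of new_capacity slots holding a copy of old_slots."""
    if new_capacity < len(old_slots):
        raise ValueError("new capacity is smaller than the existing array")
    new_slots = [None] * new_capacity
    for i, record in enumerate(old_slots):   # one copy per existing slot
        new_slots[i] = record
    return new_slots


doctor_slots = [None] * 50                   # original capacity: 50 doctors
doctor_slots[6] = {"name": "Dr. Example"}    # doctor with id 7 at index 6
doctor_slots = grow(doctor_slots, 100)       # doubled capacity: 100 doctors
```

Every record keeps its index, so the rule that id k lives at index k - 1 still holds after the resize. The cost is one copy per existing slot on each resize, which is the reason the text recommends a tree once the data becomes very large.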

Conclusion

When selecting a data structure or an algorithm, we should analyze the choice against the factors relevant to our system, and we should select a data structure that enhances the performance of the algorithms that operate on it. For fixed-size data it is better to select arrays: they are fast and easy to access and, when the id maps directly to the element index, they allow a search in a single step. When the data size keeps changing, however, we have to shift from arrays to other data structures. AVL binary trees work well with the binary search algorithm. An AVL tree can accommodate changes in size and rebalances itself after each addition or deletion, so it keeps the data ordered for binary search. Even in the worst case, when the search element is at a leaf node, the search takes only as many steps as the height of the tree.
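
To put a number on the worst case mentioned above, the tallest possible AVL tree for a given number of records can be computed from the balance rule. This sketch is an addition for illustration: the function name max_avl_height is hypothetical, and height is counted in levels, so a single node has height 1.

```python
def max_avl_height(n):
    """Largest height (in levels) that an AVL tree with n nodes can have."""
    if n < 1:
        return 0
    # The sparsest AVL tree of height h has a root, one sub-tree of height
    # h - 1 and one of height h - 2, so fewest(h) = fewest(h-1) + fewest(h-2) + 1.
    prev, curr = 0, 1        # fewest nodes needed for heights 0 and 1
    h = 1
    while True:
        nxt = prev + curr + 1    # fewest nodes needed for height h + 1
        if nxt > n:
            return h
        prev, curr = curr, nxt
        h += 1


worst_case_original = max_avl_height(1000)   # about 1000 patients
worst_case_doubled = max_avl_height(2000)    # after the patient count doubles
```

Under these assumptions, a search among 1000 patients visits at most 14 nodes, and doubling the number of patients to 2000 raises that limit only to 15.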

References

Navada, A. N. A. S. P. a. B. A. S., 2011. Overview of use of decision tree algorithms in machine learning. IEEE Control and System Graduate Research Colloquium, pp. 37-42.

Chung, P. -.. T. a. J. -.., 1992. A new decision-tree classification algorithm for machine learning. Arlington, s.n., pp. 370-377.

Yuan, F. L. X. X. a. Z. J., 2015. Decision tree algorithm optimization research based on MapReduce. Beijing, s.n., pp. 1010-1013.

O. Thabit, H. A. A. a. I. A. A.-A., 2017. From imagining to the making of a novel and fast search methodology: Thabit's algorithm. Jeddah, s.n., pp. 23-30.

Liu, Q. Z. a. P. Y., 2017. NodeLeaper: Lower Overhead Oblivious AVL Tree. IEEE Trustcom/BigDataSE/ICESS, pp. 487-493.

Yokoo, H., 1993. Application of AVL trees to adaptive compression of numerical data. Snowbird, s.n., pp. 310-319.
