Deciding housing prices using the decision tree method

Let’s look at a real-life example of the tree method: how can we use a decision tree to classify houses?
We group flats into categories by asking questions such as: which ones have more rooms? Which is an apartment and which is a house?

The figure above shows the decision tree results for flats registered with the Housing Development Board (HDB) in Singapore. We will walk through this application step by step to extract some meaningful interpretations.

The project aimed to support flat-purchase decisions. The data cover flats registered with the Housing Development Board (HDB) in Singapore, and the goal of the analysis is to decide which flat is the best to buy.

The tree first splits the flats by number of rooms: 2RM/3RM, 4RM, 5RM, and 6RM. In the figure, the orange boxes indicate this first split. According to this decision tree, your decision should start with how many rooms you would like to have.

But aren’t you curious about why the tree starts with the number of rooms? To build the decision tree, the algorithm tries many different candidate splits and keeps the ones that lead to the purest leaves. The computer does all this hard work through trial and error.
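The trial-and-error idea can be sketched in a few lines of Python. This is only a minimal illustration, not the method used in the cited study: the records in FLATS are invented toy data, the price_band labels are hypothetical, and Gini impurity is assumed as the purity measure because the text does not name one.

```python
from collections import Counter, defaultdict

# Hypothetical toy records, invented for illustration only (not the HDB data).
FLATS = [
    {"rooms": "2RM/3RM", "model": "STD", "price_band": "low"},
    {"rooms": "2RM/3RM", "model": "IMP", "price_band": "low"},
    {"rooms": "4RM", "model": "STD", "price_band": "mid"},
    {"rooms": "4RM", "model": "IMP", "price_band": "mid"},
    {"rooms": "5RM", "model": "STD", "price_band": "high"},
    {"rooms": "5RM", "model": "IMP", "price_band": "high"},
]


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    total = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def weighted_impurity(records, attribute, target="price_band"):
    """Size-weighted average impurity of the groups made by one attribute."""
    groups = defaultdict(list)
    for record in records:
        groups[record[attribute]].append(record[target])
    total = len(records)
    return sum(len(g) / total * gini(g) for g in groups.values())


def best_split(records, attributes, target="price_band"):
    """Try every candidate attribute and keep the one with the purest groups."""
    return min(attributes, key=lambda a: weighted_impurity(records, a, target))


print(best_split(FLATS, ["rooms", "model"]))
```

In this toy data every room group contains a single price band, so splitting on rooms gives perfectly pure groups, while splitting on model mixes all three bands. A real tree builder repeats the same comparison inside each resulting group.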

Now, let’s look at the second split that the algorithm arrived at after this trial and error. The light green boxes indicate these splits.

Under the first split of 2RM/3RM, the floor area (BA) is used as the second attribute to extend the tree. It splits the flats that belong to the 2RM/3RM category into six groups by floor area, ranging from BA < 58.71 to BA > 98.45, with cut points at 58.71, 68.96, 79.94, and 98.45. The figure below shows this split.
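As a rough sketch of how a floor-area split assigns a flat to a group, the snippet below uses only the four cut points quoted above. Those four values define five intervals; a sixth group would need a cut point that is not given here, so it is not modeled. The function names and the handling of values that fall exactly on a cut point are assumptions made for illustration.

```python
from bisect import bisect_right

# Cut points quoted in the text for the 2RM/3RM branch.
CUT_POINTS = [58.71, 68.96, 79.94, 98.45]


def area_group(floor_area):
    """Return the index of the floor-area interval (0 = smallest flats).

    Assumption: a value exactly equal to a cut point goes to the upper group.
    """
    return bisect_right(CUT_POINTS, floor_area)


def describe_group(index):
    """Human-readable label for an interval index from area_group."""
    if index == 0:
        return f"BA < {CUT_POINTS[0]}"
    if index == len(CUT_POINTS):
        return f"BA >= {CUT_POINTS[-1]}"
    return f"{CUT_POINTS[index - 1]} <= BA < {CUT_POINTS[index]}"


for area in (50.0, 60.0, 100.0):
    print(area, "->", describe_group(area_group(area)))
```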

Turn your attention to the four light green boxes in the center. They are the second split for the 4RM subgroup, and this split uses the model type of the flat: Model A, New Generation (NG), Improved (IMP), and Standard (STD).

The last five light green boxes on the right are the second split for the 5RM group. This split uses the floor level of the flat. For instance, FLR=2-9 means the flat is between the second and the ninth floor.

We can continue splitting in this way with different attributes to build the entire tree. The table below summarizes the final classification tree and the attribute used to split at each level. Note that the attribute used at a given level differs from one subgroup to another.
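The point that each subgroup uses its own attribute can be sketched as a small lookup. The mapping follows the second-level splits described above; the dictionary keys, field names, and the example flat are hypothetical, and the 6RM branch is omitted because its second split is not described in the text.

```python
# Second-level attribute for each first-level branch, as described in the text.
# The 6RM branch is left out because its second split is not described.
SECOND_SPLIT = {
    "2RM/3RM": "floor_area",
    "4RM": "model_type",
    "5RM": "floor_level",
}


def questions_for(flat):
    """List the attributes the tree asks about for this flat, in order."""
    path = ["rooms"]
    second = SECOND_SPLIT.get(flat["rooms"])
    if second is not None:
        path.append(second)
    return path


def describe_path(flat):
    """Show the flat's values along the path the tree follows."""
    return " -> ".join(f"{attr}={flat[attr]}" for attr in questions_for(flat))


# Hypothetical flat used only to exercise the functions.
example = {
    "rooms": "4RM",
    "floor_area": 90.0,
    "model_type": "NG",
    "floor_level": "2-9",
}
print(describe_path(example))
```

For the 4RM example the tree asks about model type and ignores floor area and floor level at the second level, which is exactly the behavior noted above.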

Let’s try to interpret our results. For 2RM/3RM flats, the floor-area groups are ordered by price: prices increase monotonically as the area goes from below 58.71 to above it. We could further split the group of 2RM/3RM flats with an area below 58.71 by model type. Together, these splits give you a way to classify flats by price. Suppose you want a 2RM flat but only have $120,000. Then you can probably get a flat with an area below 58.71 and the STD model type.

Source: Fan, G. Z., Ong, S. E., & Koh, H. C. (2006). Determinants of house price: A decision tree approach. Urban Studies, 43(12), 2301-2315.

This article is from the free online course Artificial Intelligence and Machine Learning for Business.

Created by
FutureLearn - Learning For Life