During the INFLAB course, Maarten van den Hoek and I were challenged to explore potential applications of the open data made available by the city of Rotterdam. After some experimentation we decided to use Python, MongoDB and the lessons we had learned so far in the Data Mining course to develop a web-application that calculates the correlation between different types of objects and their attributes in a given area. For instance: what is the correlation between small trees and red playground equipment? Given the right data, this could be expanded with the locations of events. For example: what is the correlation between the presence (or absence) of traffic signs and accidents? In some cases, the tool can even be used to check the quality of the different datasets. If, for instance, the soil type recorded for a tree differs from the soil type recorded for the traffic sign that stands right next to it, at least one of the data points is probably wrong.
Then we took it a step further: we divided the area into smaller sections and calculated the correlations for each subsection. The deviations from the global correlation were then projected on a map. This way both a pattern and the exceptions to that pattern can be examined. You could discover that one particular subsection of an area has a significantly more desirable correlation between two variables, for instance playgrounds and traffic accidents, than its surrounding area. This discovery could then lead to an investigation of the subsection, to see what can be learned from it to improve other areas.
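The subsection idea can be illustrated with a minimal sketch. It is not the project's actual code: it assumes that objects are plain (x, y) points, that the area is split into square grid cells, and that "correlation" means the Pearson correlation between the per-cell counts of two object types. The names `cell_counts`, `local_deviations`, `cell_size` and `cells_per_section` are made up for this illustration.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences; None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a constant series has no defined correlation
    return cov / sqrt(var_x * var_y)


def cell_counts(points, cell_size):
    """Count how many (x, y) points fall into each square grid cell."""
    return Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)


def local_deviations(points_a, points_b, cell_size, cells_per_section):
    """Return the global correlation and, per section, local minus global.

    Assumption: only cells that contain at least one object of either type
    take part in the calculation; completely empty cells are ignored.
    """
    counts_a = cell_counts(points_a, cell_size)
    counts_b = cell_counts(points_b, cell_size)
    cells = sorted(set(counts_a) | set(counts_b))
    global_r = pearson([counts_a[c] for c in cells], [counts_b[c] for c in cells])

    # Group the cells into larger square sections.
    sections = {}
    for cx, cy in cells:
        key = (cx // cells_per_section, cy // cells_per_section)
        sections.setdefault(key, []).append((cx, cy))

    deviations = {}
    for key, members in sections.items():
        local_r = pearson([counts_a[c] for c in members],
                          [counts_b[c] for c in members])
        if local_r is not None and global_r is not None:
            deviations[key] = local_r - global_r
    return global_r, deviations
```

Each value in `deviations` is the number that would be coloured on the map: a section far above or below zero behaves differently from the area as a whole.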
For this project, we were asked to develop a web-application that would enable a school to automate the process of planning their parent-teacher conferences. We were given a set of rules we needed to try to adhere to, like “Parents shouldn’t need to wait more than half an hour in total between different conferences.” And there were, of course, practical limitations we needed to account for, such as the number of available tables, the number of time-slots each evening and the annoying fact that teachers simply can’t be at two tables at once.
The assignment stated that we should use the Google App Engine platform, but we were free to choose the language. Because we were already familiar with Java, and the Go language was still in development, we chose to use Python.
We ended up building a system that allowed parents and guardians to pick which subjects needed to be discussed per child, to order the available days by preference, and to indicate whether they preferred to be planned early or late in the evening.
During the first part of this project, I focussed on the ingestion of the available data into GAE’s datastore and on generating plausible parent preferences to work with. During the second part of the project I worked on the algorithm that generated the actual planning. The parents were first sorted by the number of conferences they requested. The parents with the most conferences were planned first, as they would become harder and harder to place as the schedule filled up. This step took their preferences for day and time into account. All parents were planned with one empty time-slot between them. After a while all requested conferences would be planned, but there would be teachers planned at two or more tables at the same time, as shown in this PDF. The algorithm would then start reordering the planned conferences of each parent, until it found the order that resulted in the lowest number of conflicts. Ultimately, this would result in a planning that adhered to all the practical restrictions and most preferences, as shown in this PDF.
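The two phases described above can be sketched roughly as follows. This is a simplified illustration under several assumptions, not the original App Engine code: a single evening, one preferred start slot per parent, each parent's conferences in consecutive slots, and a plain exhaustive search over the order of each parent's conferences. The empty time-slot between parents is left out, and all function and parameter names are hypothetical.

```python
from collections import Counter
from itertools import permutations


def count_conflicts(schedule):
    """Count the extra bookings of teachers who are double-booked in a slot."""
    bookings = Counter(
        (slot, teacher)
        for conferences in schedule.values()
        for slot, teacher in conferences
    )
    return sum(n - 1 for n in bookings.values() if n > 1)


def initial_schedule(requests, preferred_start, n_slots, n_tables):
    """Place the parents with the most conferences first.

    Each parent gets consecutive slots, as close to the preferred start as
    the table capacity allows. Teacher conflicts are ignored in this phase.
    """
    tables_used = Counter()
    schedule = {}
    for parent in sorted(requests, key=lambda p: len(requests[p]), reverse=True):
        teachers = requests[parent]
        wanted = preferred_start.get(parent, 0)
        starts = sorted(range(n_slots - len(teachers) + 1),
                        key=lambda s: abs(s - wanted))
        for start in starts:
            slots = range(start, start + len(teachers))
            if all(tables_used[s] < n_tables for s in slots):
                for s in slots:
                    tables_used[s] += 1
                schedule[parent] = list(zip(slots, teachers))
                break
        else:
            raise ValueError(f"no room left to plan {parent}")
    return schedule


def reduce_conflicts(schedule):
    """Reorder each parent's conferences while that lowers the conflict count."""
    schedule = {parent: list(conf) for parent, conf in schedule.items()}
    improved = True
    while improved:
        improved = False
        for parent in schedule:
            slots = [slot for slot, _ in schedule[parent]]
            teachers = [teacher for _, teacher in schedule[parent]]
            best_order = schedule[parent]
            best_score = count_conflicts(schedule)
            for order in permutations(teachers):
                schedule[parent] = list(zip(slots, order))
                score = count_conflicts(schedule)
                if score < best_score:
                    best_order, best_score = schedule[parent], score
                    improved = True
            schedule[parent] = best_order
    return schedule
```

Because every accepted reordering strictly lowers the conflict count, the loop always terminates. It can still end in a schedule with remaining conflicts when reordering alone cannot resolve them.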
The fourth project was to build our very own mobile application. We were free to choose a platform. We chose the Android platform for the practical reason that the majority of our group already owned an Android device, which would make testing a lot easier.
After some brainstorming, we stumbled onto the idea of a Color Finder: an application that would take a photo or other image, analyze its most dominant colors and assemble a color-palette that would best represent the colors in the original image. The application was predominantly aimed at graphic designers. They would be able to easily create a new palette whenever and wherever they saw a scene with a combination of colors they found interesting. The biggest flaw in that business-model was the fact that, though some creative friends had shown interest in the application, most designers own iPhones, not Android phones. Teachers suggested the application could be geared towards a larger audience by enabling the user to use the acquired color-palette to customize his or her phone interface.
During the project I mostly focussed on the algorithm that analyzed what the relevant colors in the image were. This entailed studying what elements of an image a human being would find interesting, determining what attributes of their color set these elements apart and figuring out how to sort all colors in the image to render a relevant color-palette. I ended up building a histogram of hues which showed what hues were most predominant in the image. Then I listed the hues that caused the highest peaks. For each hue I searched for the colors in that hue that had the most saturation and brightness. This is a gross simplification of the complete algorithm, but it gives an idea of how it worked. I actually found a few ways of weighing the colors that rendered different palettes, each of which worked better in different circumstances. Unfortunately, there was not enough time to implement multiple versions in the final application.
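The simplified version of the algorithm could look something like the sketch below. It is written in Python rather than as Android code and rests on several assumptions: the bin count, the cut-off for near-grey pixels and the score "saturation times brightness" are illustrative choices, and the fullest bins stand in for the "highest peaks". The function name `dominant_colors` is made up for this example.

```python
import colorsys
from collections import defaultdict


def dominant_colors(pixels, n_bins=36, n_colors=3):
    """Pick one vivid color for each of the most common hues.

    pixels: iterable of (r, g, b) tuples with integer channels from 0 to 255.
    """
    bins = defaultdict(list)  # hue bin -> list of (score, (r, g, b))
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if s < 0.1 or v < 0.1:
            continue  # assumption: near-grey or near-black pixels have no usable hue
        # Assumption: saturation times brightness measures how vivid a color is.
        bins[int(h * n_bins) % n_bins].append((s * v, (r, g, b)))

    # The fullest bins are treated as the peaks of the hue histogram.
    peaks = sorted(bins, key=lambda k: len(bins[k]), reverse=True)[:n_colors]
    # Within each peak, keep the most saturated and bright color.
    return [max(bins[k])[1] for k in peaks]
```

Changing the scoring expression is one way to obtain the different weightings mentioned above, for example by favouring saturation over brightness.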
We were (again) asked to produce a video as well, this time a promotional video. This is what we came up with:
This time we were asked to build an online store for a small computer electronics retailer. We were supplied with a flat-file listing of all products and their attributes, and matching images.
We decided to use PHP because some of us had already worked with this language. One point of interest for this project is that we built a database in which every product attribute could have sub-attributes, and each sub-attribute could have its own sub-attributes. At the same time, attributes could have values associated with them.
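The shape of such an attribute tree can be shown with a small in-memory sketch. This is only an illustration in Python, not the PHP and database code of the project; the `Attribute` class, its fields and the example values are all hypothetical. In a relational database the same structure would typically be stored as rows that refer to their parent row, which is an assumption about the original schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Attribute:
    """An attribute with an optional value and any number of sub-attributes."""
    name: str
    value: Optional[str] = None
    children: List["Attribute"] = field(default_factory=list)

    def add(self, name, value=None):
        """Attach a sub-attribute and return it, so nesting can continue."""
        child = Attribute(name, value)
        self.children.append(child)
        return child

    def flatten(self, prefix=""):
        """Yield (path, value) pairs for every attribute that has a value."""
        path = f"{prefix}/{self.name}" if prefix else self.name
        if self.value is not None:
            yield path, self.value
        for child in self.children:
            yield from child.flatten(path)


# Example with made-up product data.
display = Attribute("display")
display.add("size", "15 inch")
resolution = display.add("resolution", "1920x1080")
resolution.add("aspect ratio", "16:9")

for path, value in display.flatten():
    print(path, "=", value)
```

The recursion in `flatten` is what makes the depth unlimited: a sub-attribute is handled exactly like a top-level attribute.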
For the editing of products we opted for a very user-friendly Ajax approach: just click the part of the product you want to edit (the name or the description, for instance) and that part changes into a text-field in which the text can be changed. Once you leave the field, the change is committed to the database.
We also made it possible to nest an unlimited number of categories into each other, and to assign multiple categories to one product.
Unfortunately my web host currently does not support the PostgreSQL version that is needed for the solutions developed for this project. Therefore there is no working version of the end result at the moment.