Machine Learning with Nexosis
Last night I attended the Ann Arbor .Net Users Group, where the topic was "Machine Learning for Fun, Finding Bigfoot". The speaker, Guy Royse, did a really good job presenting the subject. The subject matter made the talk really entertaining, and it went beyond just Bigfoot data: there were also discussions of UFOs. It was hard not to laugh as Guy described how he obtained the data for the discussion and how the classification system works. The takeaway from the talk was that this was a fun, lighthearted way to play around with machine learning.
The Nexosis tool looked really cool. It starts out free and gives you full functionality until your volume gets to the point where you have to pay for it. It takes quite a bit of volume to get to the paid level, so you can do a lot of development and learning for free. There is a web UI, a REST API, and a number of language-specific libraries that you can use to access Nexosis. Guy presented several examples using the web UI and the REST API, as well as sample code in Python.
The really cool part of the application is that it does so much of the data science for you. Unlike with some of the bigger ML systems, you can upload your data and the application will automatically build your model for you, determining the weight of each column. For a developer who wants to start working with ML, this is a great way to get started.
I think my favorite part of the discussion was the classification of the Bigfoot data. The data was classified by type of report, using a scheme similar to the close encounters system for alien encounters, except that it uses letters instead of numbers. A Class A sighting is where the person reported actually seeing Bigfoot. A Class B sighting is where the person saw evidence of Bigfoot. And my favorite, a Class C sighting, is where someone met someone else who saw Bigfoot.
After loading the data into the system, it was really simple to create statistical models based on the imported data. In the presentation, Guy showed sightings by month and year as well as a distribution on a map. You could make some pretty educated guesses as to why there were more sightings at one time or another. The summer months had more sightings because there are more people in the woods. The UFO data was even more interesting because the largest number of UFO sightings happens to fall on the 4th of July. Obviously our alien visitors love stopping by to watch the fireworks. They probably get a pretty good view from their spaceship.
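To give a feel for the "sightings by month and year" view, here is a minimal sketch in plain Python. It is not Nexosis code and not Guy's code; the SIGHTINGS list is a handful of made-up records I invented for illustration, and the function names are my own.

```python
from collections import Counter
from datetime import date

# Hypothetical sample records: (date of sighting, report class).
# These are invented for illustration and are not real report data.
SIGHTINGS = [
    (date(2001, 7, 4), "A"),
    (date(2003, 7, 19), "B"),
    (date(2005, 8, 2), "A"),
    (date(2005, 1, 15), "C"),
    (date(2010, 7, 30), "B"),
]


def sightings_by_month(records):
    """Count sightings per calendar month (1-12), ignoring the year."""
    return Counter(day.month for day, _ in records)


def sightings_by_year(records):
    """Count sightings per year."""
    return Counter(day.year for day, _ in records)


if __name__ == "__main__":
    # Print a tiny text histogram, one row per month that has sightings.
    for month, count in sorted(sightings_by_month(SIGHTINGS).items()):
        print(f"{month:02d}: {'#' * count}")
```

With real data loaded in place of the sample list, the same two counters would be enough to spot a summer bump like the one Guy pointed out.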
After setting up the model, Guy also showed how you could pass the text of a new sighting to the system and, based on language recognition, have it determine the class of the sighting. He showed a couple of examples that worked really well, and then one that didn't, and explained how the text confused the system.
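I don't know what Nexosis does under the hood, so the following is only a sketch of the general idea of classifying text, using a tiny bag-of-words naive Bayes classifier that I swapped in for illustration. The training sentences are made up, and the names TRAINING, train, and classify are my own, not part of any Nexosis library.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training reports, invented for illustration only.
TRAINING = [
    ("I saw a tall hairy creature cross the road", "A"),
    ("we watched the creature walk into the trees", "A"),
    ("found large footprints near the creek", "B"),
    ("heard loud howls and found broken branches", "B"),
    ("my neighbor told me his cousin saw something", "C"),
    ("a friend told me a story about a sighting", "C"),
]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(examples):
    """Collect per-class word counts, per-class report counts, and the vocabulary."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return word_counts, class_counts, vocab


def classify(text, model):
    """Return the class with the highest naive Bayes log score."""
    word_counts, class_counts, vocab = model
    total_reports = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Start from the log prior for this class.
        score = math.log(class_counts[label] / total_reports)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # words never seen in training carry no signal
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train(TRAINING)
    print(classify("found footprints near the river", model))
```

A classifier like this only knows the words it was trained on, so a report that mixes vocabulary from several classes can easily land in the wrong one, which is the kind of confusion Guy demonstrated with his failing example.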
In the end, we didn't actually find Bigfoot, but we got some pretty good insights into the data. If you get a chance to see Guy Royse talk, I would highly recommend it. He is going to be at Beer City Code (https://beercitycode.com) this weekend, and you can find out more about him on his blog at http://guyroyse.com. If you would like to learn more about Nexosis, their website is https://www.nexosis.com.
If you are interested in the Ann Arbor Dot Net Users Group, their website is https://www.meetup.com/aadndevs. Next month's talk is on Defending Against Advanced Web Application Threats.
Do you have any thoughts on Machine Learning, Bigfoot, or UFOs? Let me know in the comments.