February 7, 2018 • 3 min read
“Any sufficiently advanced technology is indistinguishable from magic.” Arthur C. Clarke’s famous line is probably truer of AI than of anything before it. AI produces tools and methods that are only partly understood by humans. Take AlphaGo as an example. Only a couple of years ago the mere thought of a computer beating the best humans at Go was laughable. Lee Sedol, the Go prodigy, predicted an easy victory over the computer when interviewed. The ending of this story is of course already known, and we’ve reached a state where the program has invented completely novel ways of playing the game.
As challenging as Go is (especially for me; I’ve been demolished by even the most rudimentary algorithms for a decade already), it’s still a very restricted problem. The rules are extremely simple, and the reward function is easy to construct. This is not to trivialise the achievement in any way; it’s truly and deeply astonishing. But it’s still some way off from a proper technological singularity. It’s probably still fair to say that algorithms are only approaching a state where any given (intellectual) task is more readily solved by a computer than by a human. Well, maybe with the exception of Calvinball.
Even as new expert systems like AlphaGo surface at what seems a daily rate, we’re still at a stage where a lot of the meaningful work is done, and will continue to be done, by humans. AI software provides ever improving answers, but humans are still the ones asking all the questions. As we learned from The Hitchhiker’s Guide to the Galaxy, even the best answer is useless if we’re asking the wrong questions. (As an aside, the answer apparently was, and is, more significant than even the author intended.) We at Aito are aiming to do things differently and focus on the questions. Every developer knows that one key to high productivity is making the feedback cycles as short as possible (yes, insert the obligatory XKCD reference here).
The usual AI workflow is not really tuned for this, since the training phase can take ages. A GPU can make a world of difference in training speed, especially if the information density is high, as with image or sound processing. For the larger datasets that are often needed, the turnaround time is often measured in hours or days. Granted, this depends on the problem statement and the input data, but the waiting is still a significant cost. The problem is accentuated if you end up having to iterate on your algorithm and rebuild your model. In the worst case you end up with a working model, but you asked the wrong question to start with and need to restart the whole workflow. Retraining a model to accommodate new data? You guessed it: lots of time for reading XKCD again.
At Aito we’ve built a tool where the model training phase is removed entirely. Instead we build an index/database with a query language on top of it, which allows running interactive queries against the complete live dataset. Since the model is built on the fly, new data is incorporated into queries with minimal delay. This of course allows using Aito as a traditional lookup index, by running SQL-like queries against it. The real benefit then comes from the built-in ML functionality: predictions, inference, and matching, among other things. All this without separating these functionalities into different systems or shuttling the data back and forth through exports and imports.
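To make the idea concrete, here is a minimal sketch in plain Python of what querying and predicting over the same live data can look like. Everything here is invented for illustration: the class, method names, and query shape are assumptions for this example, not Aito’s actual query language or API. The point is only the workflow: no training phase, and lookups and predictions both read the same up-to-date rows.

```python
from collections import Counter

class LazyPredictor:
    """Illustrative only: answers prediction queries directly from
    stored rows at query time, so there is no separate training step."""

    def __init__(self):
        self.rows = []

    def add(self, row):
        # New data is visible to the very next query: no retraining.
        self.rows.append(row)

    def query(self, where):
        # Traditional lookup: return rows matching all given field values.
        return [r for r in self.rows
                if all(r.get(k) == v for k, v in where.items())]

    def predict(self, where, target):
        # Predict the target field from the distribution of matching rows,
        # returning (value, probability) pairs, most likely first.
        counts = Counter(r[target] for r in self.query(where) if target in r)
        total = sum(counts.values())
        return [(value, n / total) for value, n in counts.most_common()]

db = LazyPredictor()
db.add({"product": "milk", "weekday": "mon", "sold_out": False})
db.add({"product": "milk", "weekday": "sat", "sold_out": True})
db.add({"product": "milk", "weekday": "sat", "sold_out": True})

# The same live data serves both lookup and prediction.
print(db.query({"weekday": "sat"}))
print(db.predict({"product": "milk", "weekday": "sat"}, "sold_out"))
```

A real system would of course replace the linear scan with a proper index, but the workflow is the point: adding a row and querying it are part of one loop, which is what keeps the feedback cycle short.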
One can of course argue that the problem statement is completely different when processing image data than when indexing or storing structured data. True, deep learning is not part of our current offering, but how many cat-picture classifiers or natural language processors does one really need? Utilising deep learning is an exciting exercise, but most people we’ve talked to are far less sci-fi in their ambitions. Getting more out of existing data streams is something that almost every company is striving for, and for that problem deep learning is in most cases plain overkill. Having tools that are conceptually familiar also means that taking them into use is easier. This focuses attention on the core problem rather than on the coolness of the implementation, and hence we can deliver meaningful results much quicker.
So, what does “SQL-like queries” actually mean in this case? It’s something we’re working on like crazy at the moment. It turns out that creating such a query language in a web-friendly format takes quite some work, so I will have to postpone the closer description and demo to my next post. Stay tuned.