Pulling the Plug on “KeepTheTalk.com” and the Lessons Learned
I announced its birth two months ago; now I am publishing its obit.

Keep the Talk, one of my side projects, about to be unplugged

This is my warning to all those with limited time and small networking circles: if you’re not passionate about a side project, it’s doubtful you’ll go the extra mile. Once the slope gets steep, the work gets boring, and you realize that the top of the hill is not much different from the bottom, you'll probably quit. And that’s why I am unplugging “KeepTheTalk.com”.

KeepTheTalk.com

For those of you who don’t know what it is, i.e. 99.999% of you, it is a smart Twitter newsfeed. It goes around various sites, picks up interesting news stories, and formats each one for Twitter, including image, slug, highlight, hashtags, and link. The “highlight” is the special sauce. This machine learning tool infers which sentence it thinks is interesting and pulls it out. It looks for something different from the title or first line: a bit like a summary sentence, but verbatim and with more oomph. The idea is that if a reader is in a hurry, the highlight combined with the title and image of the piece may be all he or she needs to find value in the information without clicking on it.

Case in point, here’s one of them:

I’ll leave the site running for a bit, as the server costs are trivial and some of you may want to take a peek at the tool: https://www.keepthetalk.com
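For the curious, the tweet layout described above (slug, highlight, hashtags, and link, with the image attached separately) can be sketched in a few lines of Python. This is only an illustration, not the code behind the site: the function name `format_tweet`, the layout, and the plain 280-character count are all assumptions. Twitter counts links by its own rules, so a plain `len()` is an approximation.

```python
def format_tweet(slug, highlight, hashtags, link, limit=280):
    """Assemble tweet text as: slug, quoted highlight, then hashtags and link.

    Hypothetical helper for illustration. The image is not part of the
    text; it would be attached separately when posting.
    """
    # Normalize hashtags so each one carries exactly one leading '#'.
    tags = " ".join("#" + tag.lstrip("#") for tag in hashtags)
    tail = " ".join(part for part in (tags, link) if part)

    # Everything except the highlight: slug, tail, two blank-line
    # separators, and the two quotation marks around the highlight.
    template_len = len(slug) + len(tail) + 2 * len("\n\n") + 2
    room = limit - template_len
    if room < 1:
        raise ValueError("slug, hashtags and link leave no room for a highlight")

    # Trim the highlight (never the link) and mark the cut with an ellipsis.
    if len(highlight) > room:
        highlight = highlight[: room - 1].rstrip() + "…"

    return f'{slug}\n\n"{highlight}"\n\n{tail}'


print(format_tweet(
    "Rocket launch delayed",
    "Engineers found a cracked valve in the fuel line.",
    ["space", "#nasa"],
    "https://example.com/a",
))
```

Trimming the highlight rather than the slug or link is a design choice in this sketch: the link must stay intact to be useful, and a shortened highlight still reads fine.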

It’s All About the Journey, Not the Destination

The journey was made up of easy technical feats, like scraping news sites I thought appropriate and formatting the content for Twitter. A more challenging technical hurdle was hunting for the best “highlight” sentence. Every NLP trick in the book was tried and the good ones deployed, from NLTK and TF-IDF to Word2Vec, GloVe, and BERT. The final engine did a great job with very little human ‘vetting’. I was proud of the “highlight” work, and the code will undoubtedly be repurposed for other side projects requiring summarizing or text filtering.
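To give a flavor of the simplest of those tricks, here is a minimal TF-IDF sketch for picking a highlight sentence. It is not the final engine, just a toy under stated assumptions: the sentence splitter is a naive regex, each sentence is treated as its own "document", and the names `pick_highlight` and `max_title_overlap` are hypothetical. It skips the first line and any sentence that mostly repeats the title, then returns the sentence whose words are rarest across the article.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase word tokens (letters, digits, apostrophes)."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def pick_highlight(title, body, max_title_overlap=0.5):
    """Return the sentence with the highest mean TF-IDF score, or None.

    Each sentence is one 'document'. The first sentence and sentences
    that mostly repeat the title are skipped (threshold is an assumption).
    """
    sentences = split_sentences(body)
    docs = [tokenize(s) for s in sentences]
    n_docs = len(docs)

    # Document frequency: in how many sentences does each term appear?
    df = Counter(term for doc in docs for term in set(doc))
    title_terms = set(tokenize(title))

    best_sentence, best_score = None, -1.0
    for index, doc in enumerate(docs):
        if index == 0 or not doc:
            continue  # never reuse the first line
        terms = set(doc)
        if len(terms & title_terms) / len(terms) > max_title_overlap:
            continue  # too close to the title
        tf = Counter(doc)
        # Sum of (term frequency * inverse document frequency) over terms.
        score = sum(
            (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        )
        if score > best_score:
            best_sentence, best_score = sentences[index], score
    return best_sentence


article = (
    "The rocket launch was delayed on Monday. "
    "Engineers found a cracked valve in the fuel line. "
    "The launch will be rescheduled."
)
print(pick_highlight("Rocket launch delayed", article))
```

A scorer this simple favors short sentences packed with unusual words, which is one reason the heavier tools on the list (word embeddings, BERT) are worth trying on real articles.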

Though the work was challenging and ambitious, that wasn’t the reason for pulling the plug. No, instead I blame our current taste for news and the way we consume it. The highly polarized participants and the abundance of fake and hateful messages are depressing as hell. It’s the quality of the raw news material that turned out to be the issue. I love news about tech, the state of the economy, space, and employment, but invariably, the “other” stuff would creep in and spoil the fun.

It required daily triage. I basically had to chaperone what entered the funnel to determine if the news was something I could stomach. There’s a lot of stuff out there that I don’t want to touch: all the political infighting, name-calling games, manipulation, racist slants, and fake garbage (I’m fine with reporting on fake news generation, just not the actual end product). I proved to myself that I don’t have the objectivity necessary for this type of work, nor did I want to subject myself to reading stuff that I find soul-crushing.

Being a moderator for a large social media company has got to be horrible, grinding, and depressing, and that’s the job I created for myself, without pay…

I didn’t want to curate a list of forbidden keywords, like conflicts or political figures, criminals, crimes, etc., and filter all incoming news using if-then statements. It would not only go against the essence of machine learning but would also leave me with the guilty feeling of being just another manipulator of news…

Though I have reported plenty of negative stories on big players like Facebook and Google censoring or ignoring what goes through their feeds, this project certainly made me more aware of the difficulties around those services. It also made me feel an enormous amount of empathy for anybody whose role is moderating, assisting with, or monitoring public feeds, given how quickly we let our online comments degrade into the nasty, ugly, mean, and manipulative.

Lessons Learned

I learned a lot about building SaaS apps, automating them (apparently not enough in this case), and about myself.

The idea behind the MVP Light Stack method (machine learning pipelines that end up as web apps) is to create inexpensive web presences that are as self-sufficient as possible. This ensures that when you are tired, bored, busy, or having doubts, your site still runs. This is important as we are human, and humans are moody and difficult, and we tend to do rash things on emotional whims. By creating autonomous processes to power your site through stretches of fatigue and confusion, you allow both your customers to keep being served and your vision to live another day. Keep The Talk only had a day or two of autonomy and would go stale and useless quickly thereafter. It failed to follow the MVP Light Stack tenet of being self-sufficient but succeeded in being inexpensive.

Don’t feel sorry for me; I needed closure to fully pivot. I also have other, more robust MVP pipelines humming along to keep me busy and a few ideas for future fun projects.

If you are interested in extending your ML ideas to the web, check out the ViralML school and in particular the “Applied Machine Learning” track.