My notes
We made a fake news detector with above 95% accuracy (on a validation set) that uses machine learning and Natural Language Processing that you can download here. In the real world, the accuracy might be lower...
we decided to just try and scrape domains that were known fake, real, satire, etc. and see if we could build a data set quickly... The results were crap... domains never fell into neat little categories... Some of them had fake news mixed with real news, others were just blog posts from other sites... articles where 90% of the text was Trump tweets... I started the long process of manually reading every single article before deciding what category it fell into... We hit an accuracy of about 70%... wasn’t going to be of any use to anybody...
maybe the answer isn’t detecting fake news, but detecting real news... factual and to the point, ... plenty of reputable sources ... I decided to categorize everything into two labels: real and notreal... Notreal would include satire, opinion pieces, fake news, and everything else that wasn’t written in a purely factual way that also adhered to the AP standards... the accuracy was above 95%... We call it Fakebox, and it's really easy to use.
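To make the two-label idea concrete for myself, here is a minimal sketch of collapsing fine-grained categories into real and notreal, plus a plain accuracy check. This is not Fakebox's code or API. The function names, the category strings, and the sample data are all my own assumptions for illustration.

```python
from collections import Counter

# The two labels come from the post; the constant names are my own.
REAL_LABEL = "real"
NOTREAL_LABEL = "notreal"


def collapse_label(category: str) -> str:
    """Map a fine-grained category to the two-label scheme.

    Only "real" stays real. Satire, opinion, fake news, and anything
    else all become notreal.
    """
    if category.strip().lower() == REAL_LABEL:
        return REAL_LABEL
    return NOTREAL_LABEL


def accuracy(predicted, actual):
    """Fraction of predictions that match the reference labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up categories, not data from the post.
categories = ["real", "satire", "opinion", "fake", "real", "blog"]
labels = [collapse_label(c) for c in categories]
print(Counter(labels))
```

The point of the collapse is that the model only has to recognise one well-defined style (factual, AP-style reporting) instead of separating several fuzzy categories from each other.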
Read the Full Post
The above notes were curated from the full post
towardsdatascience.com/i-trained-fake-news-detection-ai-with-95-accuracy-and-almost-went-crazy-d10589aa57c.