Deepnews.ai wants to make a decisive contribution to the sustainability of the journalistic information ecosystem by addressing two problems:
1. The lack of correlation between the cost of producing great editorial content and its economic value.
2. The vast untapped potential for news editorial products.
Deepnews.ai will have a simple and accessible scoring system: the online platform receives a batch of news stories and scores each one on a scale of 1 to 5 based on its journalistic quality. This is done automatically and in real time. This scoring system has multiple applications.
On the business side, the greatest potential lies in adjusting the price of an advertisement to the quality of its editorial context. There is room for improvement. Today, a story that required months of work and cost hundreds of thousands of dollars carries the same unit value (a few dollars per thousand page views) as a short, gossipy article. But times are changing. In the digital ad business, indicators are blinking red: CPMs, click-through rates, and viewability are in steady decline. We believe that advertisers and marketers will inevitably seek high-quality content, as long as they can rely on a credible indicator of quality. Deepnews.ai will interface with ad servers to assess the value of a story, then price and serve ads accordingly. The higher a story's quality score, the pricier the ad space adjacent to it can be. This adjustment will substantially raise the revenue per page to match the quality of news.
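As a rough illustration of how a quality score could feed into ad pricing, the sketch below scales a base CPM with the 1-to-5 score. The function name `quality_adjusted_cpm`, the linear mapping, and the `max_multiplier` parameter are all assumptions made for this example; the actual pricing logic is not described here.

```python
def quality_adjusted_cpm(base_cpm: float, score: float,
                         max_multiplier: float = 3.0) -> float:
    """Scale a base CPM linearly with a 1-to-5 quality score.

    Hypothetical rule: a score of 1 keeps the base CPM unchanged,
    and a score of 5 multiplies it by max_multiplier.
    """
    if not 1.0 <= score <= 5.0:
        raise ValueError("score must be between 1 and 5")
    # Map the score range [1, 5] onto the multiplier range [1, max_multiplier].
    multiplier = 1.0 + (score - 1.0) / 4.0 * (max_multiplier - 1.0)
    return round(base_cpm * multiplier, 2)


if __name__ == "__main__":
    for s in (1, 3, 5):
        print(s, quality_adjusted_cpm(2.0, s))
```

A linear mapping is only the simplest choice; a real ad server integration might use price tiers or auction floor prices instead.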
On the editorial side, the ability to assess the quality of news will open up opportunities for new products and services, such as:
• Better recommendation engines: instead of relying on keywords or frequency, Deepnews.ai will surface stories based on their quality, which should increase the number of articles read per visit. (Currently, visitors to many news sites read fewer than two articles per visit.)
• Personalization: We believe a reader's profile should not be limited to consumption analytics but should reflect his or her editorial preferences. Deepnews.ai is considering a dedicated "tag" that would connect stories' metadata with a reader's affinities.
• Curation: Publishers will be able to use Deepnews.ai to offer curation services, a business currently left to players like Google and Apple. By providing technology that can automatically surface the best stories from trusted websites (even small ones), Deepnews.ai can help publishers expand their footprint.
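To make the recommendation and curation use cases above more concrete, here is a minimal sketch of surfacing stories by quality score rather than by keyword or frequency. The `Story` class, its `quality_score` field, and the `min_score` and `limit` parameters are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Story:
    url: str
    quality_score: float  # hypothetical field: 1 (lowest) to 5 (highest)


def top_stories(stories: List[Story], min_score: float = 4.0,
                limit: int = 3) -> List[Story]:
    """Return up to `limit` stories scoring at least `min_score`, best first."""
    eligible = [s for s in stories if s.quality_score >= min_score]
    return sorted(eligible, key=lambda s: s.quality_score, reverse=True)[:limit]


if __name__ == "__main__":
    pool = [
        Story("https://small-site.example/investigation", 4.8),
        Story("https://big-site.example/gossip", 1.5),
        Story("https://big-site.example/analysis", 4.2),
    ]
    for story in top_stories(pool):
        print(story.quality_score, story.url)
```

Because the ranking depends only on the score, a strong story from a small site can outrank a weak one from a large site.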
The platform will be based on two ML approaches: a feature-based model and a text content analytic model.
Using traditional ML methods, the first model takes as input two sets of "signals" to assess the quality of journalistic work: Quantifiable Signals and Subjective Signals. Quantifiable Signals include the structure and patterns of the HTML page, advertising density, use of visual elements, bylines, word count, readability of the text, and information density (number of quotes and named entities). These signals are obtained by processing the news content. Subjective Signals are human scores of quality based on criteria such as writing style, thoroughness, balance and fairness, timeliness, etc. These scores are produced by editors and experienced journalists.
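A few of the Quantifiable Signals listed above can be approximated with very simple text processing. The sketch below is an assumption-laden illustration, not the platform's feature extractor: the function name `quantifiable_signals`, the regular expressions, and the use of average sentence length as a crude readability proxy are all choices made for this example. Signals such as HTML structure, advertising density, and named entities are left out.

```python
import re


def quantifiable_signals(text: str, byline: str = "") -> dict:
    """Compute a handful of illustrative signals from plain article text."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    # Naive sentence split on terminal punctuation; empty pieces are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Count passages enclosed in straight double quotes.
    quotes = re.findall(r'"[^"]+"', text)
    return {
        "word_count": len(words),
        "avg_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        "quote_count": len(quotes),
        "has_byline": bool(byline.strip()),
    }


if __name__ == "__main__":
    sample = 'The mayor said "we will rebuild" today. Work starts Monday.'
    print(quantifiable_signals(sample, byline="Jane Doe"))
```

A dictionary like this could then be combined with the human-assigned Subjective Signals as training input for a conventional supervised model.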
The second approach is based on deep learning methods. Here, the goal is to build models that can accurately classify an unseen incoming article purely on the quality of the reporting, independent of its metadata or topic. The main challenge in many such deep learning approaches is the availability of labeled data. Nearly four million contemporary articles have been processed. They come from sources deemed "good" or "commodity" (with no journalistic added value). For the bulk of our data, the reputation and consistency of the news brand carried significant weight, but the objective is also to classify quality at a finer-grained level, detached from the name of the source. To this end, various models are used to capture differences in writing that are agnostic to topical differences.
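The source-based labeling described above, where articles inherit a "good" or "commodity" label from the outlet that published them, can be sketched as follows. The domain names and the `SOURCE_LABELS` table are invented placeholders, and the function `weak_label` is a hypothetical helper; the actual source list and labeling pipeline are not specified here.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical mapping from publisher domain to a coarse quality label.
SOURCE_LABELS = {
    "quality-news.example": "good",
    "aggregator.example": "commodity",
}


def weak_label(url: str) -> Optional[str]:
    """Return the label inherited from the article's source, or None if unknown."""
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return SOURCE_LABELS.get(host)


if __name__ == "__main__":
    print(weak_label("https://www.quality-news.example/story/1"))
    print(weak_label("https://aggregator.example/clip/2"))
    print(weak_label("https://unknown.example/post/3"))
```

Labels obtained this way are noisy, since a reputable outlet can publish weak pieces and vice versa, which is one reason to aim for classification that does not depend on the source name.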