I’m trying out a documentation schedule for a new project I’m starting called NewsMelon. It will combine information retrieval and text summarization: NewsMelon will find multiple news articles about a story and provide a summary of that information. The goal is to build an MVP within the next two weeks, using a Node-based webserver, Flask for the NLP, and React/Redux for the front-end. I haven’t built anything major with Node.js yet, so I’m hoping to get acquainted with Express.js and some other useful packages.

The tentative tech stack is listed below:

Back-End:

  • Node.js running Express.js as the static file server / CRUD REST API
  • MongoDB as the main DB ( to experiment with a JSON-based data storage system )
  • Python Flask as an endpoint to access NLP functions
  • NLP: NLTK/Goose Extractor/Newspaper ( to extract article data from news article URLs )
  • LevelDB ( maybe ) via LevelUP for full-text search on the Node webserver
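
To make the summarization half of the plan concrete, here is a minimal sketch of the kind of function the Flask NLP endpoint might eventually wrap. It is not the planned implementation: it uses only the Python standard library instead of NLTK, it assumes the article text has already been extracted from the URLs, and the names summarize, split_sentences, tokenize and STOPWORDS are hypothetical. The technique is plain word-frequency sentence scoring, chosen only because it is short.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real build would
# more likely use a fuller list such as the one shipped with NLTK).
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "on", "for", "is",
    "are", "was", "were", "that", "this", "it", "with", "as", "by", "at",
    "from",
}


def split_sentences(text):
    """Naively split text on '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lowercase the text and return its words minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(articles, max_sentences=2):
    """Pick the highest-scoring sentences across several articles.

    A sentence's score is the average corpus-wide frequency of its
    non-stopword tokens, so sentences repeating what many articles
    say rank higher.
    """
    sentences = []
    for article in articles:
        sentences.extend(split_sentences(article))

    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indexes by score, keep the best ones, then restore
    # their original order so the summary reads naturally.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return [sentences[i] for i in keep]
```

Calling summarize with a list of article strings returns a list of sentences. In the stack above, a Flask route would presumably accept the article text ( or URLs ) as JSON from the Node server and return this list.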

Front-End:

  • React: UI
  • Material-UI: Basic Material Design style guide with React components ( for dev speed )
  • Redux: front-end state management ( may not use if the application is simple enough )
  • SASS: for stylin’

Plus webpack for building and npm for package management.

With an ambitious goal of 2 weeks ( maybe 6 weeks to be realistic ), I’ll try to post updates every few days on the insights I gain and the issues I come across.
