Continuing the Downloader project part II
2016-10-08 03:09:00 +0000 - Written by Carl Burks
I've been busy. I have been adding to this project a little bit here and there, but the rough alpha version is checked into source control: RedditDL
I explained what I was doing to my wife, and since I am a visual person I made some notecards. She rewrote them to make them readable.
The initial flow starts when something kicks off check_posts.py. This could be the Windows Task Scheduler, cron, or a manual run. The script fires a queue message. If you haven't set up a queue, it won't work. I plan on adding a Dockerfile to the repo later which will do this for you. If you haven't renamed the example.config.yaml, then you didn't read the README.md first and shame on you. You will also need to supply the appropriate config values, such as:
- reddit - user
- messagequeue - server
- messagequeue - user
- messagequeue - pass
- database - location
- database - engine
- fileStore - location
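One way to catch a forgotten value before anything hits the queue is a small check on the loaded config. This is only a sketch: the section and key names are taken from the list above, the nested layout is an assumption about how the YAML file is organized, and missing_config_keys is a hypothetical helper, not something that exists in RedditDL.

```python
# Hypothetical sketch: the section/key names mirror the list above;
# the actual layout of RedditDL's config file may differ.
REQUIRED_KEYS = {
    "reddit": ["user"],
    "messagequeue": ["server", "user", "pass"],
    "database": ["location", "engine"],
    "fileStore": ["location"],
}


def missing_config_keys(config):
    """Return 'section.key' names that are absent or empty in config."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        # Treat a missing or empty section as having no values at all.
        values = config.get(section) or {}
        for key in keys:
            if not values.get(key):
                missing.append("{}.{}".format(section, key))
    return missing
```

The config argument is a plain nested dict, which is what you would typically get back from parsing the YAML file. An empty result means every value in the list was supplied.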
This has been a fun project: a chance to code some Python again and to play with Docker, a message queue that isn't from Windows, YAML, JSON, and the Reddit API.
Where am I going from here?
I might convert it to Python 3 next before adding features and fixing bugs.
Extracting the core content from the text of the target URL, discarding the noise, and storing it in the database.
Adding a queue task for keyword analysis of the content.
I want to add a Docker image, which might include a web project to show the output files.
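As a rough idea of what the text extraction and keyword steps might look like, here is a minimal sketch using only the standard library. It assumes Python 3 (so it would come after the conversion), treats script and style blocks as the only "noise", and counts any lowercase word of four or more letters as a keyword. TextExtractor, extract_text, and top_keywords are hypothetical names, and a real version would likely need something smarter on both counts.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text from a page, skipping script and style blocks."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text found outside the skipped tags.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the page text as one space-joined string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def top_keywords(text, count=3):
    """Return the most common words of four or more letters."""
    words = re.findall(r"[a-z]{4,}", text.lower())
    return Counter(words).most_common(count)
```

In the flow described earlier, extract_text would run on the downloaded page before the result is stored, and top_keywords would be the body of the separate queue task.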