Commit Graph

32 Commits

Author SHA1 Message Date
vulpine 280bb20c36 change agent 2020-09-15 14:05:36 -04:00
vulpine f79b2e8cff fuzzy purge pages 2020-06-26 16:56:25 +00:00
vulpine 98ddad1e09 fix some bugs with manupulating strings with spaces 2020-06-26 16:42:22 +00:00
vulpine 56e3aab304 more aggressive url filtering 2020-06-25 13:42:07 +00:00
vulpine a6c8f78ec7 move source link 2020-06-23 16:13:58 +00:00
vulpine c58681421e dont re-crawl the same page you dingus 2020-06-23 16:04:44 +00:00
vulpine 112f6a5b19 project rename and moved off github 2020-06-23 15:56:00 +00:00
lickthecheese 8bcf8e76e9 actually sort the results lol 2020-05-05 19:45:43 -04:00
lickthecheese 33635bbad9 people can make their own regex strings lol 2020-03-20 10:55:45 -04:00
lickthecheese 13a3e1d128 fixed cleaning script 2020-03-20 10:23:33 -04:00
lickthecheese 6d7639e693 lol i wasint using uniq correctly 2020-03-19 21:07:11 -04:00
lickthecheese 5425d8a56c um whoops that deleted all my data... 2020-03-19 20:58:13 -04:00
lickthecheese 3ebc3d970f more strictly dont keep non valid data 2020-03-19 20:56:23 -04:00
lickthecheese fb6b06ad7f remove duplicates 2020-03-19 16:35:05 -04:00
lickthecheese 2d759ed8f7 allow the user to not specify a new url while crawling 2020-03-19 16:16:11 -04:00
lickthecheese 285a583492 clean command for when you cancel crawler early 2020-03-19 16:07:34 -04:00
lickthecheese 4da7f1820b obsolete script 2020-03-19 15:59:02 -04:00
lickthecheese d94901f6f2 fix slep 2020-03-19 10:35:34 -04:00
lickthecheese 1f83896371 check urls 2020-03-19 10:25:44 -04:00
lickthecheese 70f0c4d573 timeout 2020-03-19 09:19:29 -04:00
lickthecheese d8715b66a2 wait a little so you dont get rate limited 2020-03-19 08:57:43 -04:00
lickthecheese 06cffd61a2 recursive crawling 2020-03-19 08:50:46 -04:00
lickthecheese a1ffddda7d more agressive filtering of what is an actual site 2020-03-19 08:46:36 -04:00
lickthecheese 063f715700 dont download videos lol 2020-03-19 08:35:22 -04:00
lickthecheese ff13c039e7 crawl websites 2020-03-19 08:19:11 -04:00
lickthecheese 743b72ad22 gitignore 2020-03-19 07:23:19 -04:00
lickthecheese d66e81a0b7 add limit in case someone does really broad term 2020-03-18 20:27:10 -04:00
lickthecheese 848618f778 search bar 2020-03-18 20:02:21 -04:00
lickthecheese f251c397f4 basic searching 2020-03-18 19:34:29 -04:00
lickthecheese c5df2d871e works 2020-03-18 19:18:38 -04:00
lickthecheese fe624b0cda gitignore 2020-03-18 18:59:27 -04:00
lickthecheese 7fa8e8cd12 init 2020-03-18 18:25:24 -04:00