web-search-api / networks

Commit History

Update networks/google_searcher.py
caebafc
verified

MAsad789565 committed on

Update networks/filepath_converter.py
e35d7eb
verified

MAsad789565 committed on

Update networks/webpage_fetcher.py
38e2473
verified

MAsad789565 committed on

:zap: [Enhance] Add ignore classes for wikipedia.org
dd996c1

Hansimov committed on

:recycle: [Refactor] Replace output_path with html_path to avoid confuse
8c0b736

Hansimov committed on

:boom: [Fix] WebpageFetcher: raise timeout when request.get hangs
bce51d4

Hansimov committed on

:zap: [Enhance] BatchWebpageFetcher: return url_and_output_path_list
4591d96

Hansimov committed on

:gem: [Feature] New BatchWebpageFetcher: Fetch multiple urls concurrently
e92817a

Hansimov committed on

:zap: [Enhance] WebpageContentExtractor: Escape dash, and ignore
c7c538d

Hansimov committed on

:zap: [Enhance] ignore classes pattern, especially for 163.com
3dda344

Hansimov committed on

:zap: [Enhance] Rename HTMLFetcher to WebpageFetcher, and add output_parent param
62ee9e4

Hansimov committed on

:recycle: [Refactor] WebpageContentExtractor: Separate html and markdown processing
a636bcb

Hansimov committed on

:recycle: [Refactor] Move hardcoded consts to network_configs
af2c647

Hansimov committed on

:zap: [Enhance] HTMLFetcher and GoogleSearcher: support cache with overwrite, and ignore host
cf4c3f8

Hansimov committed on

:recycle: [Refactor] HTMLFetcher: replace save_path with output_path
7d44e75

Hansimov committed on

:zap: [Enhance] GoogleSearcher: Add params of result_sum and safe
8bf48d8

Hansimov committed on

:zap: [Enhance] FilepathConverter: New parent param when init
f9c42cf

Hansimov committed on

:gem: [Feature] New HTMLFetcher: download url to local html file
b259fec

Hansimov committed on

:gem: [Feature] New FilepathConverter: convert urls and queries to valid file path
64a0dbf

Hansimov committed on

:recycle: [Refactor] Move header constructor, and prettier logging
e448a74

Hansimov committed on

:gem: [Feature] New GoogleSearcher: Enable google search with query
6cf0820

Hansimov committed on