zrowitsch
3ba8a20ac7
Separate sources that are used to build gb, put them in src, and put everything else in the junkdrawer dir. Start a new/clean Makefile.
2024-05-15 22:29:34 -04:00
twistdroach
81ede57b87
Rename Errno.h/Errno.cpp to GbErrno.h/GbErrno.cpp to keep from conflicting with errno.h on case-insensitive filesystems
2023-11-04 16:57:58 -04:00
Zachary D. Rowitsch
fe48f34bad
Minor changes to get gb to compile on macos, not perfect but better. Also minor changes to avoid immediate cores when running gb on macos
2023-11-03 19:41:35 -04:00
Dmitry Smirnov
50d2cf9bc1
Removed obsolete private libiconv ( Closes : #167 )
2021-05-05 12:56:40 +10:00
Dmitry Smirnov
c124eda914
cleanup: remove local zlib. All distros provide zlib1g-dev.
2021-05-05 10:36:21 +10:00
Dmitry Smirnov
e18d2396a6
Removed private OpenSSL [hygiene,FTBFS]. All distros provide OpenSSL.
2021-05-05 10:20:11 +10:00
Matt Wells
8cf5bdc8a2
force gb to recompile version every time
you do a make, so version is updated automatically.
2014-09-19 12:23:40 -07:00
mwells
df33d0c7e4
highlight version differences in the hosts table
2014-09-19 10:33:17 -06:00
Matt Wells
2137e150e7
Merge branch 'testing' into diffbot-matt
Conflicts:
Collectiondb.cpp
Make.depend
Parms.cpp
2014-06-27 17:17:14 -07:00
mwells
7506d66d4a
fixes for page inject
2014-06-15 08:26:27 -07:00
mwells
5c0b371dc9
Merge branch 'testing' into diffbot-matt
Conflicts:
Collectiondb.cpp
HttpServer.cpp
Make.depend
Parms.cpp
Parms.h
2014-06-13 11:00:09 -07:00
mwells
ea90e7f755
more fixes for sectiondb markup code
2014-06-12 13:05:45 -07:00
mwells
108c281c33
fix annoying bug when adding new parms.
2014-06-10 12:29:50 -07:00
Matt Wells
8aa0662a27
Merge branch 'diffbot' into testing
Conflicts:
Make.depend
PageResults.cpp
Parms.cpp
Spider.cpp
Spider.h
gb.conf
2014-03-08 09:38:44 -07:00
Matt Wells
953b7c558d
parm updates
2014-02-10 21:45:03 -07:00
Matt Wells
c9ef525338
code checkpoint
2014-02-09 12:55:45 -07:00
Matt Wells
8d534b8ed8
many more fixes for streaming mode
2014-02-06 18:21:22 -08:00
Matt Wells
6af9441818
change deduping logic to be first come first
server, but site rank trumps. fixed bug from
fix before.
2014-01-29 16:14:42 -08:00
Matt Wells
313cffc322
had to add per round page and process counts
in case they had maxToCrawl and respider frequencies
set. simplified round logic in Spider.cpp.
2014-01-23 13:23:09 -08:00
Matt Wells
58d0c444ac
fixes for the global index quota system
2014-01-19 19:38:23 -08:00
Matt Wells
3ec44c5b35
fix streaming mode for sending back json
downloads/dumps.
2014-01-17 18:28:17 -08:00
Matt Wells
4b27b22949
git rebalancing working right
2014-01-15 17:40:17 -08:00
Matt Wells
8a49e87a61
got code with shard rebalancing compiling.
now we store a "sharded by termid" bit in posdb
key for checksums, etc keys that are not sharded
by docid. save having to do disk seeks on every
host in the cluster to do a dup check, etc.
2014-01-11 16:08:42 -08:00
mwells
76bb3d05e1
clean up logging so i can see what's going on
2013-12-10 16:41:30 -08:00
mwells
82494baa89
move CollectionRec stuff into Collectiondb files
for simplicity.
2013-12-10 15:28:04 -08:00
Matt Wells
e0a15194e1
fix json double decoding issue. no more
partial decodes, json parser stores
fully decoded string into separate buf.
2013-11-22 14:16:14 -08:00
Matt Wells
b589b17e63
fix collection resetting.
2013-10-18 15:21:00 -07:00
Matt Wells
a288217e9f
a few bug fixes
2013-10-17 18:59:00 -07:00
Matt Wells
9d6c3626d8
json indexing/hashing updates.
2013-10-16 15:41:12 -07:00
mwells
43e4c939eb
Merge branch 'master' into diffbot
Conflicts:
Make.depend
2013-10-02 13:15:07 -06:00
Matt Wells
c911a606c9
renamed matches.h and matches.cpp to
matches2.h and matches2.cpp to avoid potential
confusion with Matches.h and Matches.cpp files.
2013-10-01 07:58:24 -07:00
mwells
d11e9520bd
couple fixes to makefile etc.
2013-09-28 16:37:39 -06:00
mwells
fd081478de
fix crawlbot to work on a distributed network
as far as adding/deleting/resetting colls
and updating parms. ideally we'd have a Colldb
Rdb where each key was a parm. that would make
syncing easier if a host went down, then it would
get the negative/positive colldb parm keys later.
so it could sync up on all your operations as long
as all your operations in terms of adding and deleting
database key/value pairs.
2013-09-26 22:41:05 -06:00
Matt Wells
f90d20f4dd
diffbot api integration updates
2013-09-18 15:07:47 -07:00
Matt Wells
f974d6a47b
fixes for crawlbot universal api.
2013-09-16 10:49:37 -07:00
mwells
e152205765
make depend update
2013-09-09 02:37:47 -06:00
mwells
ca2a024d04
fixed up thread/spider log msgs.
fixed core from calling fprintf in
alarm signal missed quickpoll handler.
2013-08-29 21:15:42 -06:00
mwells
4f4047a3ad
new Make.depend.
2013-08-09 17:13:45 -06:00
Matt Wells
f6e560c1f4
Initial file population.
2013-08-02 13:12:24 -07:00