23 Commits

SHA1 Message Date
c61bbd256c init class members in Json 2016-09-26 19:32:57 +02:00
9a9e764938 Remove niceness from json 2016-09-23 22:29:38 +02:00
8dab0db467 Remove extra semicolons 2016-05-19 18:37:26 +02:00
de3f884fe2 Remove undefined function 2016-03-08 23:28:32 +01:00
ab0b9d03ea Standardize header guards 2016-03-08 22:16:02 +01:00
d43bc2d92b Add const for some Json encode/decode methods (used in unit test) 2015-11-12 12:10:39 +01:00
2060be357e Move existing JSON test into unit test 2015-11-11 17:10:32 +01:00
c947252fee Add gbcapturedate to individual doc's metadata when injecting warcs. 2015-10-04 01:53:54 -06:00
9642947136 fix so host will delete then re-add collections
that use the same collnum but have a different name.
fixed some unlabelled safebufs.
fix core when deleting collnum from tree/buckets that
is higher than Collectiondb.m_numRecs.
fix File::m_filename safebufs that were not freed on exit.
2015-08-18 14:09:16 -07:00
7b507a70ef Set value length to 0 for something that does not return a string value
in Json.cpp.
Fix the '-' -> '_' when indexing generic fields.
Add a StackBuf macro which is a Safebuf initialized with a small
stack buffer for use in a local scope.
2015-06-30 14:09:57 -06:00
3e5218c54c fix gbssDocId:123456789, et al, query. will only work for docs indexed
after applying this fix.
2015-04-13 14:13:16 -06:00
e346a14a47 added logic to retry diffbot reply on connection reset,
connection timed out or gateway timed out (http status 504)
msgs.  added logic to detect truncated json (missing final })
and not print it. also, at index time, we set a diffbot missing
curly error to g_errno so the whole url can be retried later.
2015-03-09 20:54:34 -07:00
4c19453ea9 working with -m32 for basic testing.
compiles for 64-bit.
2014-11-12 11:38:37 -08:00
96b8197ad3 now it compiles with -m32 2014-11-10 14:45:11 -08:00
c5ae5ca4b5 v3 support for tokenized diffbot replies
using the "objects" array in the json.
2014-05-12 16:13:24 -07:00
9c26b85c2f fixed contenthash32 logic for json objects.
fixed hashing of numbers/bools for json objects.
added m_dupCache to reduce spiderrequests added to spiderdb.
do not add urls to waitingtree if ufn is obviously filtered/banned.
do not spider spiderrequest from doledb if maxoutperip would
be violated.
2014-02-05 13:22:03 -08:00
e0a15194e1 fix json double decoding issue. no more
partial decodes, json parser stores
fully decoded string into separate buf.
2013-11-22 14:16:14 -08:00
fbcd6b8afd display json objects that are not in arrays
in csv. show csv header. how to deal
with heterogenous object lists?
index spiderdate: for gbsortby:spiderdate.
added gbrevsortby: support.
2013-11-12 13:51:52 -08:00
a288217e9f a few bug fixes 2013-10-17 18:59:00 -07:00
9d6c3626d8 json indexing/hashing updates. 2013-10-16 15:41:12 -07:00
a562c65627 another code checkpoint. new json api
for crawlbot. new url filters for crawlbot.
2013-10-14 16:10:48 -06:00
0de777d80d parser fixes 2013-10-11 17:35:12 -06:00
6d5643e185 json parsing 2013-10-11 16:14:26 -06:00