37 Commits

Author SHA1 Message Date
bdd8cb5338 #include clean up Query.h 2016-12-08 16:56:09 +01:00
aba937780d Stop #including Conf.h from header files 2016-11-12 20:24:20 +01:00
8c9dac486b Add more logs & logTraceSummary 2016-11-11 16:49:19 +01:00
e8b4ba277d Fix bug in adding horizontal ellipsis 2016-11-11 16:34:53 +01:00
0c87a19507 Reduce minRemoveEllipsisLen. Do we even need this length? 2016-11-11 13:57:47 +01:00
6f844b7ed0 Add trace logs for Pos.cpp 2016-11-11 13:51:55 +01:00
4e7e67aff2 member init in Pos 2016-10-21 22:41:03 +02:00
91cf6435b5 init class members 2016-09-23 12:21:13 +02:00
b5da0d29ec Fix all caps PDF title 2016-06-09 14:52:18 +02:00
148fa6bc19 constness in Pos 2016-05-24 16:55:59 +02:00
e5df68eeb7 Constness in Pos 2016-05-24 12:16:40 +02:00
ad7bb591af Re-added private access specifier in Words class
Caused quite a lot of changes where other code had its dirty hands in the
innards of Words. Added necessary accessor methods, and used the opportunity to
add const where possible.
2016-05-23 16:41:14 +02:00
b25e30e128 Partially fix broken unit test from ellipsis changes 2016-05-13 11:31:28 +02:00
2a2f832225 Use horizontal ellipsis code point U+2026
Instead of three dots.
2016-05-12 16:39:15 +02:00
52dd27ed82 Remove same punctuation in summary 2016-02-18 22:18:42 +01:00
ef195980d3 Treat meta content as HTML instead of plain text 2016-02-15 15:55:52 +01:00
7b9898f453 fix coredump when filtering weird summaries 2016-02-04 13:47:49 +01:00
61f4b27aeb Don't replace '>' & '<' with '|' when converting from HTML entities 2016-01-29 19:19:06 +01:00
4036c9ff3d Don't always replace <br> tag with '. '. We could end up with '.. '. 2016-01-28 13:09:58 +01:00
2b084cdd23 Fix ellipsis handling when we're less than 4 characters from limit. Fix unit test. 2016-01-20 13:49:30 +01:00
ac8249e07d Improve title/summary for youtube 2016-01-20 13:32:13 +01:00
46246af716 Trim ellipsis from title or summary. We'll add it ourselves. 2016-01-19 16:58:17 +01:00
35817ef817 Fix bug where we're referencing uninitialized buffer 2016-01-18 20:16:06 +01:00
d3261c495a Add unit test for Pos::filter. Fix bug in previous commit where a single all-caps word was uncapitalized, instead of only an all-caps buffer. 2016-01-18 19:09:38 +01:00
0117b2148e When all caps title/summary is encountered, capitalize only start of every 'word'. This is done only for all caps ascii to avoid handling special cases for now. 2016-01-18 17:16:45 +01:00
bed3182988 Use meta tags (og:title & title) & title tag when available for generating title 2016-01-15 15:53:14 +01:00
1c94c8c065 Skip getting meta tags from inside gbframe (expanded iframe) 2016-01-13 13:26:37 +01:00
9ae607ecf6 Try to get a nicer summary by using what the website set as description
Use the following in priority order (highest first)
 - itemprop = "description"
 - meta name = "og:description"
 - meta name = "description"
2016-01-12 15:49:37 +01:00
2c14f659e4 Remove similar/unused Words::set methods 2016-01-12 11:46:28 +01:00
2689ebc572 Remove emoticons from summary. Added system test to check for symbols removal.
Fix bug in emoticon detection.
Add unit test for emoticon detection.
2016-01-08 15:22:02 +01:00
7d0fa2385d Don't get summary text from 'script' / 'style' tags 2016-01-07 11:50:56 +01:00
0884edf08e Fix title for PDF files & add some simple tests for it 2015-12-01 12:38:51 +01:00
d53a4eb811 Remove commented out code, unused variable, general cleanup 2015-11-25 16:51:27 +01:00
87285ba3cd use gbmemcpy not memcpy so we can get profiler working again
since memcpy can't be interrupted and backtrace() called.
2015-01-13 12:25:42 -07:00
96b8197ad3 now it compiles with -m32 2014-11-10 14:45:11 -08:00
e7dd8f7956 replace long long with int64_t 2014-10-30 13:36:39 -06:00
f6e560c1f4 Initial file population. 2013-08-02 13:12:24 -07:00
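
The all-caps handling described in commits 0117b2148e and d3261c495a (lowercase an all-caps ASCII title or summary except for the first letter of each word, and leave mixed-case text alone) can be sketched roughly as follows. This is an illustration only, not the repository's Pos::filter: the function names `isAllCapsAscii` and `capitalizeWords` are hypothetical, and treating any non-letter as a word boundary is an assumption.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not from the repository): true when the text is pure
// ASCII, contains at least one letter, and contains no lowercase letters.
bool isAllCapsAscii(const std::string &text) {
    bool sawLetter = false;
    for (unsigned char c : text) {
        if (c >= 0x80) return false;        // non-ASCII byte: skip special cases entirely
        if (std::islower(c)) return false;  // mixed case: not an all-caps buffer
        if (std::isupper(c)) sawLetter = true;
    }
    return sawLetter;
}

// Hypothetical sketch: if the whole buffer is all-caps ASCII, keep the first
// letter of each word uppercase and lowercase the rest. Otherwise return the
// text unchanged, so a single all-caps word inside mixed-case text survives.
std::string capitalizeWords(const std::string &text) {
    if (!isAllCapsAscii(text)) return text;

    std::string out = text;
    bool startOfWord = true;  // assumption: any non-letter ends the current word
    for (char &ch : out) {
        unsigned char c = static_cast<unsigned char>(ch);
        if (std::isalpha(c)) {
            if (!startOfWord) ch = static_cast<char>(std::tolower(c));
            startOfWord = false;
        } else {
            startOfWord = true;
        }
    }
    return out;
}
```

Under these assumptions, `capitalizeWords("FIX ALL CAPS PDF TITLE")` returns `"Fix All Caps Pdf Title"`, while `capitalizeWords("Fix PDF title")` is returned unchanged because the buffer as a whole is not all caps.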