mirror of https://github.com/yacy/yacy_search_server.git synced 2025-05-12 21:59:33 -04:00

6085 Commits

Author SHA1 Message Date
Michael Peter Christen
c88c30a5c5 added an option to ViewFile to see all solr fields which contain texts 2024-09-21 21:51:19 +02:00
Michael Peter Christen
d181b9e89b added deleted files from commit 254f12d60b which are still needed and had been linked outside of yacy/ui 2024-07-24 15:57:51 +02:00
Michael Peter Christen
910a496c9f replaced http links with https 2024-07-21 18:02:58 +02:00
Michael Peter Christen
fd45ccf76e added sponsoring images 2024-07-21 17:25:22 +02:00
Michael Christen
2f5f3f8853
Merge pull request from zutto/master
Fix autocrawler crashing
2024-07-10 16:35:39 +02:00
okybaca
254f12d60b removed yacy/ui as obsolete 2024-07-09 16:26:50 +02:00
zutto
962aaec0c0 Improve the clarity of deep crawl feature UI text on AutoCrawler 2024-06-29 09:37:05 +03:00
EPS-DEV
160e346d4c fix: Remove SayAt.Me Link 2024-04-04 15:05:41 +00:00
Michael Peter Christen
3268a93019 added a 'minified' option to YaCy dumps 2023-11-13 10:27:50 +01:00
Michael Peter Christen
c20c4b8a21 modified export: added maximum number of docs per chunk
The export file can now be many files, called chunks.
By default still only one chunk is exported.
This function is required in case that the exported files shall be
imported to an elasticsearch/opensearch index. The bulk import function
of elasticsearch/opensearch is limited to 100MB. To make it possible to
import YaCy files, those must be split into chunks. Right now we
cannot estimate the chunk size in bytes, only as a number of documents.
The user must do experiments to find out the optimum chunk max size,
like 50000 docs per chunk. Try this as first attempt.
2023-11-12 22:11:55 +01:00
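The chunking idea in the commit above can be illustrated with a minimal sketch. This is not YaCy's export code: the function name `chunked` and the parameter `max_docs_per_chunk` are hypothetical, and the documents are plain strings standing in for exported records. It only shows splitting a stream by document count, since the commit notes that size in bytes cannot be estimated.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunked(docs: Iterable[str], max_docs_per_chunk: int) -> Iterator[List[str]]:
    """Yield successive lists holding at most max_docs_per_chunk documents."""
    if max_docs_per_chunk < 1:
        raise ValueError("max_docs_per_chunk must be at least 1")
    it = iter(docs)
    while True:
        # Take the next slice of the stream; an empty slice means we are done.
        chunk = list(islice(it, max_docs_per_chunk))
        if not chunk:
            return
        yield chunk


# Seven stand-in documents split into chunks of at most three documents.
chunks = list(chunked((f"doc-{i}" for i in range(7)), 3))
```

Each yielded list would then be written to its own export file; a limit such as 50000 documents per chunk, as suggested in the commit message, is a starting point to tune by experiment.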
Michael Peter Christen
655d8db802 detailed directions in index export to explain how the export can be
imported again using elasticsearch/opensearch
2023-11-12 15:26:18 +01:00
Michael Peter Christen
7db0534d8a Added a zim parser to the surrogate import option.
You can now import zim files into YaCy by simply moving them
to the DATA/SURROGATE/IN folder. They will be fetched and after
parsing moved to DATA/SURROGATE/OUT.
There are exceptions where the parser is not able to identify the
original URL of the documents in the zim file. In that case the file
is simply ignored.
This commit also carries an important fix to the pdf parser and an
increase of the maximum parsing speed to 60000 PPM which should make it
possible to index up to 1000 files in one second.
2023-11-05 02:16:40 +01:00
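Dropping a zim file into the import folder, as the commit above describes, might be scripted roughly as follows. This is a sketch under assumptions: `queue_zim_for_import` and `yacy_home` are hypothetical names, only the `DATA/SURROGATE/IN` path comes from the commit message, and creating the folder when it is missing is a convenience of this example, not something the commit states.

```python
import shutil
from pathlib import Path


def queue_zim_for_import(zim_file: Path, yacy_home: Path) -> Path:
    """Move a zim file into DATA/SURROGATE/IN below the YaCy directory."""
    inbox = yacy_home / "DATA" / "SURROGATE" / "IN"
    # Assumption of this sketch: create the folder if it does not exist yet.
    inbox.mkdir(parents=True, exist_ok=True)
    target = inbox / zim_file.name
    shutil.move(str(zim_file), str(target))
    return target
```

According to the commit message, YaCy then picks the file up and moves it to `DATA/SURROGATE/OUT` after parsing.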
okybaca
4add1f6bc7 replaced all the links to legacy legacy wiki to legacy wiki 2023-10-29 13:12:24 +01:00
Michael Peter Christen
4308aa5415 removed concept of empty passwords as "no passwords used",
because we now start YaCy with a default password (yacy).
This has an impact on all functions that check the current state of
password protection and that handled the empty-password situation,
including the warnings to set a password in case none is set (which
cannot be the case any more).
2023-10-25 22:56:06 +02:00
Michael Peter Christen
4da320bebf added a warning message in ConfigBasic in case that the default password
was not changed.
2023-10-24 23:36:26 +02:00
okybaca
4c1eb34e85 modified link to Process Scheduler in left menu 2023-10-10 08:30:04 +02:00
okybaca
08b769f63a modified crawl list so the URL links to external URL 2023-08-28 13:01:45 +02:00
Michael Peter Christen
0554056c63 added .txt search result page (just replace '.html' with '.txt' in yacysearch.html page to get a url list) 2023-08-19 14:57:31 +02:00
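The URL rewrite described in the commit above can be sketched as a small helper. The function name `to_txt_result_url` is hypothetical, and the host, port and `query` parameter in the usage line are illustrative assumptions about a local peer; only the replacement of `.html` with `.txt` on the `yacysearch.html` page comes from the commit message.

```python
from urllib.parse import urlsplit, urlunsplit


def to_txt_result_url(search_url: str) -> str:
    """Turn a yacysearch.html URL into the matching yacysearch.txt URL."""
    parts = urlsplit(search_url)
    if not parts.path.endswith("yacysearch.html"):
        raise ValueError("not a yacysearch.html URL")
    # Swap only the file extension of the path; the query string is kept.
    new_path = parts.path[: -len(".html")] + ".txt"
    return urlunsplit(parts._replace(path=new_path))


# Illustrative address of a local peer; adjust host, port and query as needed.
txt_url = to_txt_result_url("http://localhost:8090/yacysearch.html?query=test")
```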
Michael Peter Christen
d8f26cb6a7 larger link structure image 2023-02-24 19:11:35 +01:00
Michael Peter Christen
9fcd8f1bda added canonical filter
attention: this is on by default!
(it should do the right thing)
2023-01-16 14:50:30 +01:00
Michael Peter Christen
5a52b01c09 front-end integration of tag valency 2023-01-15 20:13:45 +01:00
Michael Peter Christen
5acd98f4da introduction of tag-to-indexing relation TagValency 2023-01-13 17:20:18 +01:00
Michael Peter Christen
309adb814e fixed jsonlist import from searchlab.eu using a direct URL 2022-10-25 00:51:53 +02:00
Michael Peter Christen
62d177bf59 stub for jsonlist index importer web page 2022-10-23 12:22:31 +02:00
Michael Peter Christen
761dbdf06d increases log history length to 10000
implements https://github.com/yacy/yacy_search_server/issues/512
2022-10-05 16:09:28 +02:00
Michael Peter Christen
60c9986a0e new release file names with date and git hash
...without reference to 9000ish SVN
2022-10-04 15:31:47 +02:00
Michael Peter Christen
d9e847a6b0 fixed html error 2022-10-02 23:42:54 +02:00
Michael Peter Christen
adbda4c71b moved all remaining servlet classes to new location 2022-10-02 23:22:12 +02:00
Michael Peter Christen
33889b4501 moved more servlets to new location 2022-10-02 22:57:58 +02:00
Michael Peter Christen
6d388bb7bf refactoring - moved htroot/yacy classes 2022-10-02 22:26:53 +02:00
Michael Peter Christen
48fcf3b3b5 alternative servlet method, tested with wiki
may become the future method to store servlets
2022-09-30 18:29:01 +02:00
Michael Peter Christen
d23dea2642 refactoring 2022-09-30 17:42:21 +02:00
Michael Peter Christen
a2a40a3096 new link to crawlstart api documentation 2022-09-29 00:25:51 +02:00
Michael Peter Christen
9c1bc533fa removed hazelcast because it is phoning home, see also:
https://github.com/yacy/yacy_search_server/issues/504
2022-09-28 17:30:37 +02:00
Michael Peter Christen
fc98ca7a9c removed ContentControl servlet and functionality
This was not used at all (as far as I know) and was blocking a smooth
integration of ivy in the context of an existing JSON parser.
2022-09-28 17:25:04 +02:00
Thomas Koch
3116713672 rm buildDate from build.xml and its usages
The https://reproducible-builds.org project invests a lot of work
to make builds reproducible. This is a security property. It allows
to compare the build of binaries from different builder machines.
If they are identical, it means that either the builds have not
been manipulated or an attacker managed to attack all builder
machines in exactly the same way.

One problem that the reproducible-builds project often sees is
that projects include the build time in their binaries. This
makes builds unreproducible for apparently no reason. The build
date should not be of interest since binaries built on different
dates but from the same source code should not be different.

Thus I decided to remove the build date instead of re-implementing
the functionality without the GitRev task. Anyways the reported
date was not the build date but the date of the last git commit
which is even less informative. The git commit ID would have
information value but should only be relevant for "nightly builds".
2022-07-10 11:32:38 +00:00
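The reproducibility argument in the commit above can be illustrated with a minimal sketch. It is not part of the YaCy build: `sha256_hex` and `builds_match` are hypothetical helpers, and the byte strings are stand-ins for binaries from two builder machines. It shows why an embedded date makes otherwise identical builds differ.

```python
import hashlib


def sha256_hex(artifact: bytes) -> str:
    """Return the SHA-256 digest of a build artifact as a hex string."""
    return hashlib.sha256(artifact).hexdigest()


def builds_match(artifact_a: bytes, artifact_b: bytes) -> bool:
    """Two builds count as identical when their digests are equal."""
    return sha256_hex(artifact_a) == sha256_hex(artifact_b)


# Stand-ins for the output of two builder machines using the same source.
source_output = b"compiled classes"
reproducible_a = source_output
reproducible_b = source_output
# The same output with a build date embedded on two different days.
stamped_a = source_output + b" built 2022-07-09"
stamped_b = source_output + b" built 2022-07-10"
```

Here `builds_match(reproducible_a, reproducible_b)` holds, while the two date-stamped variants no longer compare equal even though the source is unchanged.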
Thomas Koch
572558244a rm unused build properties PKGMANAGER, RESTARTCMD, DESTDIR
PKGMANAGER is always false, thus the java code wrapped in
if statements for this property is dead code and can also
be removed.

The Debian packaging removed in c4659f0fb01be0f68ce3dcccc1955c8662a5345f
did set the PKGMANAGER property to true. When we do distro
packages again, we can revisit this commit and redo it with
property files instead.

RESTARTCMD is only used inside that dead code.

DESTDIR is never used even in the build.xml
2022-07-10 10:14:51 +00:00
Michael Peter Christen
3d138d3fdd catch error when initializing hazelcast
should fix https://github.com/yacy/yacy_search_server/issues/468
2022-06-20 17:27:56 +02:00
thkoch2001
336100514d
Settings_HttpClient.inc spelling correction
certificats > certificates

Thanks to @CloudyProton
2022-04-07 17:33:16 +03:00
tangdou1
eae3674130
Update ConfigBasic.html 2022-02-28 20:57:58 +08:00
Burkhard
a6a9828181
Merge pull request from lfuelling/master
Add setting for public facing port
2022-02-11 08:09:17 +01:00
Burkhard
4219d729c3
Update SettingsAck_p.java
typo in SwitchboardConstants.SERVER_PUBLICPORT
2022-02-11 08:04:55 +01:00
Burkhard
e0fd3d4f10
Update Settings_p.java
missing setting to display the value
2022-02-11 08:02:24 +01:00
reger24
84651f2925 Make bookmarks.html accessible
for the time being (as possibly another decision has been made)
make the bookmarks feature accessible, as it is available but w/o link in UI

relates to https://github.com/yacy/yacy_search_server/issues/452#issuecomment-1033054368
2022-02-11 04:25:26 +01:00
reger24
a37352cfa7 Update link to Moby in DictionaryLoader_p.html
see issue https://github.com/yacy/yacy_search_server/issues/455
2022-02-11 03:19:30 +01:00
reger24
a7e93d9328 Add option to add host to default blacklist from search result
- added authorized icon/button to blacklist a host
- host is added to default blacklist
- inspired by https://github.com/yacy/yacy_search_server/issues/213#issuecomment-412485190
2022-02-09 19:42:04 +01:00
reger24
05d6d0405f Move sub-menu UI Translations from public Status to secure Sys Administration
- as UI Translation (TransNews_p.html) is a secured page
- it uses the internal News system for publishing but does not really belong to "Community Data"
2022-02-08 22:42:11 +01:00
reger24
027e284ef9 Enhance visibility of current blacklist by a different color in header
in servlet Blacklist_p.html
bugfix for 18dddb74c9
2022-02-06 09:43:59 +01:00
reger24
18dddb74c9 Harmonize loading/reading blacklist
between init and servlet to use the same procedures
- added BlacklistHelper.blacklistToSortedArray to simplify use in servlet
2022-02-06 00:10:55 +01:00
reger24
11c4a1b45c Blacklist import from file, exclude comment lines
starting with  # // or ;
inspired by issue https://github.com/yacy/yacy_search_server/issues/446
2022-02-05 17:38:29 +01:00
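The comment filtering described in the commit above might look roughly like this. It is a sketch, not the YaCy implementation: `blacklist_entries` is a hypothetical name, the sample entries are illustrative, and skipping blank lines is an assumption of this example. Only the three comment prefixes `#`, `//` and `;` come from the commit message.

```python
from typing import Iterable, List

# Comment markers named in the commit message.
COMMENT_PREFIXES = ("#", "//", ";")


def blacklist_entries(lines: Iterable[str]) -> List[str]:
    """Return blacklist entries, dropping comment lines."""
    entries = []
    for raw in lines:
        line = raw.strip()
        # Assumption of this sketch: blank lines are skipped as well.
        if not line or line.startswith(COMMENT_PREFIXES):
            continue
        entries.append(line)
    return entries


sample = [
    "# hosts to block",
    "ads.example.com/.*",
    "// another comment",
    "; and one more",
    "",
    "tracker.example.org/.*",
]
entries = blacklist_entries(sample)
```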