YaCy Release 1.82

The main enhancements of this release are: IPv6 routing of Peer-to-Peer
network elements and overall IPv6 enhancements, added self stats logging
(showing the number of peers in the network over time), enhancement of
embedded http server (less loading by browser using expires dates),
document date detection, document snapshots (an optional document
archive created by the crawler), partial updates to solr during
postprocessing and more postprocessing enhancements, new UPNP classes,
updates to many external libraries such as pdf parser and jetty.
Furthermore, there is now a release notes web page which shows all the
details at http://yacy.net/release_notes/

Major Changes   

Mon Jan 19 03:30:35 CET 2015
by reger
refactor opensearch heuristic
introduce FederateSearchManager, which handles search heuristics to external systems via specific FederateSearchConnectors.
These provide the query() functionality, the translation to the YaCy schema (.toYaCySchema()) and the search() routine that delivers results to search events; search() is generally implemented in the abstract connector (AbstractFederateSearchConnector).
The manager now enforces a minimum delay of 15 s between calls to external systems.
Besides the OpensearchConnector, a SolrFederateSearchConnector is available. It uses an additional config file for field name translation.
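
The minimum delay between calls could be enforced along the following lines. This is a minimal sketch in Python, not YaCy's Java code: the class name MinDelayGate and its method are hypothetical, and it assumes that the delay is tracked separately for each external system.

```python
import time

class MinDelayGate:
    """Hypothetical helper: permits a call to a target only if at least
    min_delay seconds have passed since the last permitted call."""

    def __init__(self, min_delay=15.0, clock=time.monotonic):
        self.min_delay = min_delay
        self.clock = clock      # injectable so the gate can be tested
        self.last_call = {}     # target name -> time of last permitted call

    def try_acquire(self, target):
        now = self.clock()
        last = self.last_call.get(target)
        if last is not None and now - last < self.min_delay:
            return False        # too early: skip this external system
        self.last_call[target] = now
        return True
```

A caller would ask the gate before each external query and simply skip systems for which try_acquire() returns False.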

default heuristicopensearch.conf: 
- openbdb.com removed - it no longer seems to deliver results
- config via solrconnector to datacite.org added (large technical library archive)
Changed Files: defaults/federatecfg/datacite.solr.schema, defaults/heuristicopensearch.conf, htroot/ConfigHeuristics_p.java, htroot/ConfigNetwork_p.java, htroot/yacysearch.java, source/net/yacy/cora/federate/AbstractFederateSearchConnector.java, source/net/yacy/cora/federate/FederateSearchConnector.java, source/net/yacy/cora/federate/FederateSearchManager.java, source/net/yacy/cora/federate/SolrFederateSearchConnector.java, source/net/yacy/cora/federate/opensearch/OpenSearchConnector.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/search/Switchboard.java
Sun Jan 04 18:47:47 CET 2015
by sixcooler
bump to Solr-/Lucene-4.10.3
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, defaults/solr/solrconfig.xml, lib/lucene-analyzers-common-4.10.3.jar, lib/lucene-analyzers-phonetic-4.10.3.jar, lib/lucene-classification-4.10.3.jar, lib/lucene-codecs-4.10.3.jar, lib/lucene-core-4.10.3.jar, lib/lucene-facet-4.10.3.jar, lib/lucene-grouping-4.10.3.jar, lib/lucene-highlighter-4.10.3.jar, lib/lucene-join-4.10.3.jar, lib/lucene-memory-4.10.3.jar, lib/lucene-misc-4.10.3.jar, lib/lucene-queries-4.10.3.jar, lib/lucene-queryparser-4.10.3.jar, lib/lucene-spatial-4.10.3.jar, lib/lucene-suggest-4.10.3.jar, lib/solr-core-4.10.3.jar, lib/solr-solrj-4.10.3.jar, nbproject/project.xml, pom.xml
Sun Jan 04 11:10:45 CET 2015
by reger
Added a "don't store remote search results" option
This is intended for peers that want to participate in the P2P network but don't wish to fill up their index with the metadata of every received search result.
The DHT transfer is not affected by this option (it works as usual, so a peer that disables the new store-to-index switch still receives and holds the metadata according to DHT rules).
The downside for the local peer is that search speed will not improve for search terms that are only available remotely or through quick hits in the local index.

To still be able to improve the local index, a Click-Servlet option was added as well.
If switched on, all search result links point to this servlet, which forwards the user's browser (by html header) to the desired page and feeds the page to the fulltext index.
The servlet accepts a parameter defining the action to perform (see defaults/web.xml: index, crawl, crawllinks)

The option check-boxes are placed in ConfigPortal.html
Changed Files: defaults/web.xml, defaults/yacy.init, htroot/ConfigPortal.html, htroot/ConfigPortal.java, htroot/yacysearchitem.java, source/net/yacy/http/servlets/ClickServlet.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/SwitchboardConstants.java, source/net/yacy/search/query/SearchEvent.java, source/net/yacy/search/query/SearchEventCache.java
Sun Dec 21 18:10:15 CET 2014
by Michael Peter Christen
added experimental pdf splitting which enables YaCy to split pdfs during
parsing into individual pages and add them all using different URLs.
These constructed urls are generated from the source url with a
page=<pagenumber> attribute appended to the url get/post properties,
which distinguishes the individual page entries. The search result list
then replaces this parameter with a url anchor (# mark), so that the
original url is presented in the search result. These URLs can be opened
directly on the correct page using pdf.js, which is now built into
firefox. That means: if you find a search hit on page 5 and click on the
search result, firefox will open the pdf viewer and show page 5.
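
The URL handling described above can be sketched as follows. This is an illustration in Python, not the code from pdfParser.java or ResultEntry.java; both function names are hypothetical, and the anchor format #page=<n> is an assumption based on the open parameters that pdf.js understands.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def page_entry_url(source_url, page):
    """Build the per-page index URL by appending page=<page> to the query."""
    parts = urlsplit(source_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("page", str(page)))
    return urlunsplit(parts._replace(query=urlencode(query)))

def result_list_url(entry_url):
    """Move the last page=<n> query parameter into a #page=<n> anchor."""
    parts = urlsplit(entry_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    positions = [i for i, (key, _) in enumerate(query) if key == "page"]
    if not positions:
        return entry_url  # not a split-pdf entry, leave untouched
    _, page = query.pop(positions[-1])
    return urlunsplit(parts._replace(query=urlencode(query),
                                     fragment="page=" + page))
```

Only the last page parameter is moved, so a source url that already carries its own page parameter keeps it.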
Changed Files: source/net/yacy/crawler/retrieval/SitemapImporter.java, source/net/yacy/document/parser/pdfParser.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/SwitchboardConstants.java, source/net/yacy/search/snippet/ResultEntry.java
Fri Dec 19 21:54:17 CET 2014
by reger
update to Jetty 9.2.6
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/jetty-9.2.6.v20141205.License, lib/jetty-client-9.2.6.v20141205.jar, lib/jetty-continuation-9.2.6.v20141205.jar, lib/jetty-deploy-9.2.6.v20141205.jar, lib/jetty-http-9.2.6.v20141205.jar, lib/jetty-io-9.2.6.v20141205.jar, lib/jetty-jmx-9.2.6.v20141205.jar, lib/jetty-proxy-9.2.6.v20141205.jar, lib/jetty-security-9.2.6.v20141205.jar, lib/jetty-server-9.2.6.v20141205.jar, lib/jetty-servlet-9.2.6.v20141205.jar, lib/jetty-servlets-9.2.6.v20141205.jar, lib/jetty-util-9.2.6.v20141205.jar, lib/jetty-webapp-9.2.6.v20141205.jar, lib/jetty-xml-9.2.6.v20141205.jar, nbproject/project.xml
Fri Dec 19 02:54:38 CET 2014
by reger
update to PDFBox 1.8.8
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/fontbox-1.8.8.License, lib/fontbox-1.8.8.jar, lib/jempbox-1.8.8.License, lib/jempbox-1.8.8.jar, lib/pdfbox-1.8.8.License, lib/pdfbox-1.8.8.jar, nbproject/project.xml, pom.xml
Mon Dec 15 23:32:46 CET 2014
by Michael Peter Christen
Added a transaction interface to the snapshots: all documents in the
snapshots can now be processed with transactions using commit and
rollback commands. Furthermore, a large number of monitoring methods have
been added to check the success of transactions.

The transactions for snapshots have two main components: an rss search
API to get information about the latest/oldest entries and a
commit/rollback API to move entries away from the rss results. This is
done by using two storage locations for the snapshots, INVENTORY and
ARCHIVE. New snapshots are placed in INVENTORY, committed snapshots move
to ARCHIVE, and rolled-back snapshots move to INVENTORY again.

Normal Workflow:
Besides all the options below, it is usually sufficient to process data
like this:
- call
http://localhost:8090/api/snapshot.rss?state=INVENTORY&order=LATESTFIRST
- process the rss result and use the <guid> value as <urlhash> (see next
command)
- for each processed result call
http://localhost:8090/api/snapshot.json?command=commit&urlhash=<urlhash>
- then you can call the rss feed again and the committed urls are omitted
from the next set of items.
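
The workflow above can be sketched from the client side as follows. This is a minimal Python sketch under assumptions: the helper names are hypothetical, the <guid> elements are assumed to be plain RSS 2.0 elements without a namespace, and the json answer is assumed to be a flat object carrying the "result" property described for commit/rollback.

```python
import json
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE = "http://localhost:8090"  # address used in the examples of this commit

def inventory_feed_url(order="LATESTFIRST", base=BASE):
    """URL of the rss feed listing snapshots that are still in INVENTORY."""
    query = urlencode({"state": "INVENTORY", "order": order})
    return base + "/api/snapshot.rss?" + query

def urlhashes_from_rss(rss_text):
    """Collect the <guid> values of a feed; each one is a urlhash."""
    root = ET.fromstring(rss_text)
    return [g.text.strip() for g in root.iter("guid")
            if g.text and g.text.strip()]

def commit_url(urlhash, base=BASE):
    """URL that moves one processed snapshot from INVENTORY to ARCHIVE."""
    query = urlencode({"command": "commit", "urlhash": urlhash})
    return base + "/api/snapshot.json?" + query

def succeeded(json_text):
    """True if the answer reports result=success (assumed flat json)."""
    return json.loads(json_text).get("result") == "success"
```

A client would fetch inventory_feed_url(), process each entry, request commit_url() for its urlhash, for example with urllib.request.urlopen, and check the answer with succeeded() before fetching the feed again.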

These are the commands to control this:
The rss feed:
http://localhost:8090/api/snapshot.rss?state=INVENTORY&order=LATESTFIRST
http://localhost:8090/api/snapshot.rss?state=INVENTORY&order=OLDESTFIRST
http://localhost:8090/api/snapshot.rss?state=INVENTORY&order=ANY
http://localhost:8090/api/snapshot.rss?state=ARCHIVE&order=LATESTFIRST
http://localhost:8090/api/snapshot.rss?state=ARCHIVE&order=OLDESTFIRST
http://localhost:8090/api/snapshot.rss?state=ARCHIVE&order=ANY

The feed returns a <urlhash> in the <guid> field of each rss item. This
must be used for commit/rollback:

Commit/Rollback:
http://localhost:8090/api/snapshot.json?command=commit&urlhash=<urlhash>
http://localhost:8090/api/snapshot.json?command=rollback&urlhash=<urlhash>
The json returns a property list containing the property "result"
with the possible values "success" or "fail", according to the outcome.
If a "fail" occurs, please look into the log for further info.

Monitoring:
http://localhost:8090/api/snapshot.json?command=status
This shows the total number of entries in the INVENTORY and the ARCHIVE 
http://localhost:8090/api/snapshot.json?command=list
This returns a list of all hosts which have snapshots and the number
of entries for each host. Counts for INVENTORY and ARCHIVE are listed in
the properties "count.INVENTORY" and "count.ARCHIVE"
http://localhost:8090/api/snapshot.json?command=list&depth=2
The list can be restricted to documents at a specific crawl depth. The
list then contains the same host names, but the count values change
because only documents at that crawl depth are counted
http://localhost:8090/api/snapshot.json?command=list&host=yacy.net.80
This lists all urlhashes for the given host, not only an accumulated
list of the number of entries
http://localhost:8090/api/snapshot.json?command=list&host=yacy.net.80&depth=0
This restricts the list of urlhashes for that host to the given depth
http://localhost:8090/api/snapshot.json?command=list&state=INVENTORY
http://localhost:8090/api/snapshot.json?command=list&state=ARCHIVE
This selects either the INVENTORY or ARCHIVE for all list commands,
default is ALL which means that from both snapshot directories the host
information is collected and combined. You can use the state option for
all the commands as listed above
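
The list command and its optional restrictions can be combined as in the following sketch. The function name snapshot_list_url is hypothetical; the parameter names host, depth and state are the ones shown in the URLs above.

```python
from urllib.parse import urlencode

def snapshot_list_url(host=None, depth=None, state=None,
                      base="http://localhost:8090"):
    """Build a command=list URL with optional host, depth and state."""
    params = [("command", "list")]
    if host is not None:
        params.append(("host", host))        # e.g. "yacy.net.80"
    if depth is not None:
        params.append(("depth", str(depth)))
    if state is not None:
        if state not in ("INVENTORY", "ARCHIVE"):
            raise ValueError("state must be INVENTORY or ARCHIVE")
        params.append(("state", state))      # omitted means both locations
    return base + "/api/snapshot.json?" + urlencode(params)
```

Called as snapshot_list_url(host="yacy.net.80", depth=0), this reproduces the URL shown above for a host restricted to depth 0.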

Detailed Information:
http://localhost:8090/api/snapshot.json?command=metadata&urlhash=upiFJ7Fh1hyQ
This collects metadata information for the given urlhash. This can also
be restricted with state=INVENTORY or state=ARCHIVE to test whether the
document is in that particular snapshot directory. If a urlhash is not
found, an empty result is returned. If an entry was found and the state
was not restricted, the result contains a state property with the name
of the location where the document is, either INVENTORY or ARCHIVE.

Hint:
If a very large number of documents is inside of INVENTORY, then it
could be better to call the rss feed with
http://localhost:8090/api/snapshot.rss?state=INVENTORY&order=ANY
because that is very efficient.
Changed Files: htroot/api/snapshot.java, source/net/yacy/crawler/data/Snapshots.java, source/net/yacy/crawler/data/Transactions.java
Mon Dec 15 20:45:05 CET 2014
by reger
update to metadata-extractor-2.7.0.jar
add 2 simple JUnit test cases for jpeg and tif parsing
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/metadata-extractor-2.7.0.License, lib/metadata-extractor-2.7.0.jar, lib/xmpcore-5.1.2.jar, nbproject/project.xml, pom.xml, source/net/yacy/document/parser/images/genericImageParser.java, source/net/yacy/document/parser/images/metadataImageParser.java, test/net/yacy/document/parser/images/genericImageParserTest.java, test/net/yacy/document/parser/images/metadataImageParserTest.java, test/parsertest/YaCyLogo_120ppi.jpg, test/parsertest/YaCyLogo_120ppi.tif
Sun Dec 14 13:40:45 CET 2014
by Michael Peter Christen
Added and integrated a new date detection class which can identify date
notions within the fulltext of a document. This class also attempts to
identify dates that are abbreviated, lack a year or are described by the
names of special days, like 'Halloween'. If a date has no year given,
the current year and the following years are considered.
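
How a date without a year expands into several candidates can be illustrated like this. It is a sketch only, not the detection class itself: the function name is hypothetical, and the number of following years to consider (years_ahead) is an assumption, since the commit message does not state it.

```python
from datetime import date

def candidate_dates(month, day, today, years_ahead=1):
    """Return one candidate date per considered year for a day/month
    expression that was found without a year."""
    candidates = []
    for year in range(today.year, today.year + years_ahead + 1):
        try:
            candidates.append(date(year, month, day))
        except ValueError:
            pass  # e.g. 29 February in a year that is not a leap year
    return candidates
```

For 'Halloween' found in a document parsed on 14 December 2014, this yields 31 October 2014 and 31 October 2015, which is one reason why a single expression can contribute several dates.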

This process is therefore able to identify a large set of dates for a
document, either because several dates are given in the document or
because a date is ambiguous. Four new Solr fields are used to store the
parsing result:

dates_in_content_sxt:
if date expressions can be found in the content, these dates are listed
here in order of appearance

dates_in_content_count_i:
the number of entries in dates_in_content_sxt

date_in_content_min_dt:
if dates_in_content_sxt is filled, this contains the oldest date from
the list of available dates

date_in_content_max_dt:
if dates_in_content_sxt is filled, this contains the most recent date
from the list of available dates, which may also lie in the future
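
The relation between the four fields can be sketched as follows. The function name is hypothetical and the ISO string serialization is an assumption made for the illustration; the actual storage format is defined by the Solr schema.

```python
from datetime import date

def date_fields(dates_in_order):
    """Derive the four date fields from the dates detected in a document,
    given in order of appearance."""
    fields = {
        "dates_in_content_sxt": [d.isoformat() for d in dates_in_order],
        "dates_in_content_count_i": len(dates_in_order),
    }
    if dates_in_order:  # min and max only exist if the list is filled
        fields["date_in_content_min_dt"] = min(dates_in_order).isoformat()
        fields["date_in_content_max_dt"] = max(dates_in_order).isoformat()
    return fields
```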

These fields are deactivated by default because the evaluation of the
regular expressions that detect the dates is still too CPU intensive.
Future enhancements may allow this to be switched on by default.

The purpose of these fields is the creation of calendar-like search
facets, to be implemented next.
Changed Files: defaults/solr.collection.schema, source/net/yacy/cora/date/GenericFormatter.java, source/net/yacy/crawler/data/Transactions.java, source/net/yacy/data/ymark/YMarkAutoTagger.java, source/net/yacy/document/Condenser.java, source/net/yacy/document/content/SurrogateReader.java, source/net/yacy/document/parser/torrentParser.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/DocumentIndex.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java
Tue Dec 09 16:20:34 CET 2014
by Michael Peter Christen
enhanced the snapshot functionality:

- snapshots can now also be xml files which are extracted from the solr
index and stored as individual xml files in the snapshot directory
alongside the pdf and jpg images
- a transaction layer was placed on top of the snapshot directory to
divide snapshots into 'inventory' and 'archive'. This may be used
to do transactions of index fragments using archived solr search results
between peers. This is currently unfinished; we need a protocol to move
snapshots from inventory to archive
- the SNAPSHOT directory was renamed to snapshot and now contains two
snapshot subdirectories: inventory and archive
- snapshots may now be generated by everyone, not only by peers running
on a server with wkhtmltopdf installed. The expert crawl start provides
the option for snapshots to everyone. PDF snapshots are now optional and
the option is only shown if wkhtmltopdf is installed.
- the snapshot api now serves requests for historised xml files,
e.g. call:
http://localhost:8090/api/snapshot.xml?urlhash=Q3dQopFh1hyQ
Such an xml file is identical to a solr search result with only one hit.
The pdf generation has been moved from the http loading process to the
solr document storage process. This may slow down the process a lot and
a different version of the process may be needed.
Changed Files: htroot/CrawlStartExpert.html, htroot/CrawlStartExpert.java, htroot/Crawler_p.java, htroot/QuickCrawlLink_p.java, htroot/api/snapshot.java, source/net/yacy/cora/federate/solr/responsewriter/EnhancedXMLResponseWriter.java, source/net/yacy/cora/util/Html2Image.java, source/net/yacy/crawler/CrawlSwitchboard.java, source/net/yacy/crawler/data/CrawlProfile.java, source/net/yacy/crawler/data/Snapshots.java, source/net/yacy/crawler/data/Transactions.java, source/net/yacy/data/ymark/YMarkCrawlStart.java, source/net/yacy/document/parser/html/TransformerWriter.java, source/net/yacy/repository/LoaderDispatcher.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/DocumentIndex.java, source/net/yacy/search/index/Segment.java
Fri Dec 05 01:13:37 CET 2014
by reger
ViewFile servlet: update index if newer,
so that the viewed text and the stored metadata info are consistent
- to achieve this, use a request with a profile that allows indexing (defaultglobaltext) and update the index
   (the resource is loaded and parsed anyway, so it's not an expensive operation)

Request: remove 2 unused init parameters
- number of anchors of the parent
- forkfactor sum of anchors of all ancestors
Changed Files: htroot/HostBrowser.java, htroot/QuickCrawlLink_p.java, htroot/ViewFile.java, htroot/api/push_p.java, htroot/rct_p.java, source/net/yacy/crawler/CrawlStacker.java, source/net/yacy/crawler/data/CrawlQueues.java, source/net/yacy/crawler/retrieval/Request.java, source/net/yacy/crawler/retrieval/SitemapImporter.java, source/net/yacy/data/ymark/YMarkCrawlStart.java, source/net/yacy/http/ProxyCacheHandler.java, source/net/yacy/http/ProxyHandler.java, source/net/yacy/repository/LoaderDispatcher.java, source/net/yacy/search/Switchboard.java, source/net/yacy/server/http/HTTPDProxyHandler.java
Wed Dec 03 11:45:48 CET 2014
by Michael Peter Christen
added a servlet which can create preview images, preview thumbnails and
preview pdfs from web pages, e.g.:
http://localhost:8090/api/snapshot.png?url=http://yacy.net/en/&width=128&height=128
http://localhost:8090/api/snapshot.jpg?url=http://yacy.net/en/&width=128&height=128
http://localhost:8090/api/snapshot.pdf?url=http://yacy.net/en/
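
Such preview URLs can be assembled programmatically, as in this sketch. The function name preview_url is hypothetical. Unlike the examples above, the sketch percent-encodes the target url, which is standard practice for a url passed as a query parameter; that the servlet decodes it is an assumption.

```python
from urllib.parse import urlencode

def preview_url(page_url, ext="png", width=None, height=None,
                base="http://localhost:8090"):
    """Build a snapshot servlet URL for a png, jpg or pdf preview."""
    if ext not in ("png", "jpg", "pdf"):
        raise ValueError("ext must be png, jpg or pdf")
    params = [("url", page_url)]
    if width is not None:
        params.append(("width", str(width)))
    if height is not None:
        params.append(("height", str(height)))
    return base + "/api/snapshot." + ext + "?" + urlencode(params)
```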

This also supports on-the-fly generation of the preview documents if
the user is an administrator. Otherwise, the servlet fails.
To enable this, you must add wkhtmltopdf, imagemagick and (on headless
servers) xvfb to your operating system.

for detailed instructions, see
https://gitorious.org/yacy/rc1/commit/97f6089a41a4ed40aef84f692690e30f50585f5d
Changed Files: htroot/api/snapshot.java, source/net/yacy/cora/util/Html2Image.java, source/net/yacy/crawler/data/Snapshots.java, source/net/yacy/http/servlets/YaCyDefaultServlet.java
Tue Dec 02 16:26:07 CET 2014
by Michael Peter Christen
Replaced all fixed thread pools with cached thread pools. The cached
thread pools flush their cached (dead) threads after 60 seconds.
As a result, YaCy now runs constantly with about 50 threads and
about 100 at peak times. Previously, about 400 threads were cached
and kept in a hibernation state, so the numproc counter
in /proc/user_beancounters (exists only in VM-hosted linux) was as high
as the number of cached threads. VM supervisors therefore terminated
whole VM sessions when a limit was reached. Many VM providers
have limits of numproc=96, which made it virtually impossible to run YaCy
on such machines. With this change, it will be possible to run many YaCy
instances even on VM hosts.
Changed Files: source/net/yacy/cora/protocol/Domains.java, source/net/yacy/cora/protocol/http/HTTPClient.java, source/net/yacy/document/importer/MediawikiImporter.java, source/net/yacy/kelondro/blob/ArrayStack.java
Mon Dec 01 15:03:09 CET 2014
by Michael Peter Christen
YaCy can now create web page snapshots as pdf documents which can later
be transcoded into jpg for image previews. To create such pdfs, do the
following:

Add wkhtmltopdf and imagemagick to your OS:
On a Mac, download wkhtmltox-0.12.1_osx-cocoa-x86-64.pkg from
http://wkhtmltopdf.org/downloads.html and download
http://cactuslab.com/imagemagick/assets/ImageMagick-6.8.9-9.pkg.zip
On Debian, run "apt-get install wkhtmltopdf imagemagick"

Then check in /Settings_p.html?page=ProxyAccess: "Transparent Proxy" and
"Always Fresh" - this is used by wkhtmltopdf to fetch web pages using
the YaCy proxy. Using "Always Fresh" it is possible to get all pages
from the proxy cache.

Finally, you will see a new option when starting an expert web crawl.
You can set the maximum crawl depth up to which pdfs are generated.
The resulting pdfs are then available in
DATA/HTCACHE/SNAPSHOTS/<host>.<port>/<depth>/<shard>/<urlhash>.<date>.pdf
Changed Files: htroot/CrawlStartExpert.html, htroot/CrawlStartExpert.java, htroot/Crawler_p.java, htroot/QuickCrawlLink_p.java, source/net/yacy/cora/util/Html2Image.java, source/net/yacy/crawler/CrawlSwitchboard.java, source/net/yacy/crawler/data/CrawlProfile.java, source/net/yacy/crawler/data/Snapshots.java, source/net/yacy/crawler/retrieval/HTTPLoader.java, source/net/yacy/data/ymark/YMarkCrawlStart.java, source/net/yacy/http/ProxyCacheHandler.java, source/net/yacy/search/Switchboard.java
Sat Nov 29 11:56:32 CET 2014
by Michael Peter Christen
added new web page snapshot infrastructure which will lead to the
ability to have web page previews in the search results.
(This is a stub, no function available with this yet...)
Changed Files: htroot/Crawler_p.java, htroot/QuickCrawlLink_p.java, source/net/yacy/crawler/CrawlSwitchboard.java, source/net/yacy/crawler/data/CrawlProfile.java, source/net/yacy/crawler/data/Snapshots.java, source/net/yacy/crawler/retrieval/HTTPLoader.java, source/net/yacy/data/ymark/YMarkCrawlStart.java, source/net/yacy/kelondro/index/BufferedObjectIndex.java, source/net/yacy/kelondro/util/OS.java, source/net/yacy/search/Switchboard.java
Fri Nov 28 20:24:39 CET 2014
by reger
update to Jetty 9.2.4
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/jetty-9.2.4.v20141103.License, lib/jetty-client-9.2.4.v20141103.jar, lib/jetty-continuation-9.2.4.v20141103.jar, lib/jetty-deploy-9.2.4.v20141103.jar, lib/jetty-http-9.2.4.v20141103.jar, lib/jetty-io-9.2.4.v20141103.jar, lib/jetty-jmx-9.2.4.v20141103.jar, lib/jetty-proxy-9.2.4.v20141103.jar, lib/jetty-security-9.2.4.v20141103.jar, lib/jetty-server-9.2.4.v20141103.jar, lib/jetty-servlet-9.2.4.v20141103.jar, lib/jetty-servlets-9.2.4.v20141103.jar, lib/jetty-util-9.2.4.v20141103.jar, lib/jetty-webapp-9.2.4.v20141103.jar, lib/jetty-xml-9.2.4.v20141103.jar, nbproject/project.xml, pom.xml
Mon Nov 24 20:28:52 CET 2014
by Michael Peter Christen
added an option to make the YaCy proxy act as if the cache is never
stale. If set to 'Always Fresh', the cache is always used if the entry
exists in the cache. This is a good way to archive web content and
access it without going online again, provided the documents exist.
To do so, open /Settings_p.html?page=ProxyAccess and check the "Always
Fresh" checkbox.
This is set to false by default, which behaves as before.
If you set this to true, then you have your web archive in DATA/HTCACHE.
Copy this to carry around your private copy of the internet!
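
The effect of the switch on the cache decision can be summarised in a few lines. This is an illustrative sketch with hypothetical names, not the logic in Response.java, and it leaves out how staleness itself is computed.

```python
def serve_from_cache(entry_exists, entry_is_stale, always_fresh=False):
    """Decide whether the proxy answers from the cache."""
    if not entry_exists:
        return False          # nothing cached, the page must be loaded
    if always_fresh:
        return True           # 'Always Fresh': an existing entry is never stale
    return not entry_is_stale # previous behaviour: honour staleness
```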
Changed Files: defaults/yacy.init, htroot/SettingsAck_p.html, htroot/SettingsAck_p.java, htroot/Settings_ProxyAccess.inc, htroot/Settings_p.java, source/net/yacy/crawler/retrieval/Response.java
Wed Nov 19 17:36:56 CET 2014
by Michael Peter Christen
added loading of the synonyms file from addon/synonyms into the
knowledge loader
Changed Files: htroot/DictionaryLoader_p.html, htroot/DictionaryLoader_p.java, source/net/yacy/cora/language/synonyms/SynonymLibrary.java, source/net/yacy/data/ymark/YMarkAutoTagger.java, source/net/yacy/document/Condenser.java, source/net/yacy/document/LibraryProvider.java, source/net/yacy/document/parser/torrentParser.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/DocumentIndex.java, source/net/yacy/search/index/Segment.java
Thu Nov 13 00:58:58 CET 2014
by Michael Peter Christen
added a new 'firstSeen' database table and the necessary data structures
which hold a date for each URL to record when the url was first seen.
This date is then used to overwrite the modification date of urls upon
recrawl if the first-seen date is before the latest document date. This
is necessary because content management systems commonly attach the
current date to all documents. Using the firstSeen database it is
possible to approximate a real first document creation date if the
crawler starts frequently for the same domain. As a result, search
results ordered by date have a much better quality, and so does the
usage of YaCy as a search agent for latest news.
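
The rule described above can be sketched as follows. The class and function names are hypothetical and an in-memory dictionary stands in for the database table; the sketch only shows that the first observation of a url is kept and that the earlier of the two dates wins.

```python
from datetime import datetime

class FirstSeenTable:
    """Hypothetical stand-in for the firstSeen table: url -> first date."""

    def __init__(self):
        self._first_seen = {}

    def observe(self, url, now):
        """Record now for a new url; return the stored first-seen date."""
        return self._first_seen.setdefault(url, now)

def index_date(first_seen, document_date):
    """Use the first-seen date if it lies before the document date."""
    return first_seen if first_seen < document_date else document_date
```

On a recrawl, a page whose CMS reports today's date keeps the date at which the crawler first encountered it.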
Changed Files: htroot/IndexControlURLs_p.html, htroot/IndexControlURLs_p.java, htroot/ViewFile.html, htroot/ViewFile.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/schema/CollectionConfiguration.java
Sun Nov 09 23:06:36 CET 2014
by sixcooler
update to httpclient-4.3.6
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/dependencies.txt, lib/httpclient-4.3.6.License, lib/httpclient-4.3.6.jar, lib/httpmime-4.3.6.License, lib/httpmime-4.3.6.jar, nbproject/project.xml, pom.xml
Fri Nov 07 18:51:31 CET 2014
by sixcooler
update to solr-/lucene-4.10.2
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, defaults/solr/solrconfig.xml, lib/commons-codec-1.9.License, lib/commons-codec-1.9.jar, lib/lucene-analyzers-common-4.10.2.jar, lib/lucene-analyzers-phonetic-4.10.2.jar, lib/lucene-classification-4.10.2.jar, lib/lucene-codecs-4.10.2.jar, lib/lucene-core-4.10.2.jar, lib/lucene-facet-4.10.2.jar, lib/lucene-grouping-4.10.2.jar, lib/lucene-highlighter-4.10.2.jar, lib/lucene-join-4.10.2.jar, lib/lucene-memory-4.10.2.jar, lib/lucene-misc-4.10.2.jar, lib/lucene-queries-4.10.2.jar, lib/lucene-queryparser-4.10.2.jar, lib/lucene-spatial-4.10.2.jar, lib/lucene-suggest-4.10.2.jar, lib/lucene.License, lib/solr-core-4.10.2.jar, lib/solr-solrj-4.10.2.jar, lib/solr.License, pom.xml, source/net/yacy/search/index/Fulltext.java
Fri Oct 24 15:04:40 CEST 2014
by Michael Peter Christen
fix for exact_signature_unique_b, exact_signature_copycount_i,
fuzzy_signature_unique_b and fuzzy_signature_copycount_i: apply same
criteria for 'valid document' as for title and description uniqueness
test.
Changed Files: source/net/yacy/cora/federate/solr/logic/AbstractOperations.java, source/net/yacy/cora/federate/solr/logic/BooleanLiteral.java, source/net/yacy/cora/federate/solr/logic/CatchallLiteral.java, source/net/yacy/cora/federate/solr/logic/Conjunction.java, source/net/yacy/cora/federate/solr/logic/Disjunction.java, source/net/yacy/cora/federate/solr/logic/Literal.java, source/net/yacy/cora/federate/solr/logic/LongLiteral.java, source/net/yacy/cora/federate/solr/logic/StringLiteral.java, source/net/yacy/search/schema/CollectionConfiguration.java
Tue Oct 14 12:19:59 CEST 2014
by Michael Peter Christen
added partial updates to solr during postprocessing: during
postprocessing the solr documents are no longer retrieved completely.
Instead, only the fields needed for the postprocessing are extracted.
When Solr documents are written, this is done using partial updates.

This increases postprocessing speed by about 50% for embedded Solr
configurations. For external Solr configurations the improvement should
be much higher, because postprocessing with a remote Solr has been very
slow; partial updates to a remote Solr are expected to gain even more
than the increase seen with a local Solr.
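
For illustration, a partial (atomic) update in Solr's JSON update syntax names only the document id and the changed fields, each wrapped in a "set" modifier. The sketch below builds such a payload; it is not the SolrJ code used by YaCy, the helper name is hypothetical, and the key field name "id" is an assumption.

```python
import json

def partial_update(doc_id, changed_fields, id_field="id"):
    """Build one Solr atomic-update document that sets only the given
    fields and leaves all other stored fields untouched."""
    doc = {id_field: doc_id}
    for name, value in changed_fields.items():
        doc[name] = {"set": value}
    return doc

# Example payload for Solr's JSON update handler (values are made up)
payload = json.dumps([partial_update("upiFJ7Fh1hyQ", {
    "exact_signature_unique_b": True,
    "exact_signature_copycount_i": 0,
})])
```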
Changed Files: source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/search/schema/CollectionConfiguration.java
Tue Oct 07 17:51:07 CEST 2014
by Michael Peter Christen
Added an experimental audio feedback system.
This is the first element of a new 'decoration' component which may hold
switches for different external appearance parameters.
The first switch in that context is decoration.audio (as usual in
yacy.init). This value is set to false by default, that means the audio
feedback element is switched off by default. To switch it on, set
decoration.audio = true (using /ConfigProperties_p.html). You will then
hear sounds for the following events:
- remote searches
- incoming dht transmissions
- new documents from the crawler
Sound clips are stored in htroot/env/soundclips/ because a future
implementation will read these files using the http client and
configurable urls, which will make it very easy for the user to replace
the given sounds with their own.
Changed Files: defaults/yacy.init, htroot/env/soundclips/atmocrawling.wav, htroot/env/soundclips/atmomonitor.wav, htroot/env/soundclips/dhtin.wav, htroot/env/soundclips/newdoc.wav, htroot/env/soundclips/remotesearch.wav, htroot/env/soundclips/sources.txt, htroot/yacy/search.java, htroot/yacy/transferURL.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/SwitchboardConstants.java
Tue Oct 07 13:10:06 CEST 2014
by Marc Nause
Finished implementation of UPNP:

*) will try other ports if YaCy standard ports are not available
*) distinguish between internal and external port (not sure if this
works 100%)

Still to add: property in config to enter own external port (in case of
manually configured NAT)
Changed Files: htroot/ConfigBasic.java, htroot/ConfigPortal.java, htroot/ConfigSearchBox.java, htroot/CrawlStartScanner_p.java, htroot/Load_MediawikiWiki.java, htroot/Load_PHPBB3.java, htroot/SettingsAck_p.java, htroot/Settings_p.java, htroot/Table_API_p.java, htroot/api/push_p.java, htroot/opensearchdescription.java, htroot/yacysearch.java, htroot/yacysearch_location.java, source/net/yacy/gui/Tray.java, source/net/yacy/http/Jetty9HttpServerImpl.java, source/net/yacy/peers/Seed.java, source/net/yacy/peers/SeedDB.java, source/net/yacy/search/Switchboard.java, source/net/yacy/server/http/HTTPDemon.java, source/net/yacy/server/serverSwitch.java, source/net/yacy/utils/upnp/UPnP.java, source/net/yacy/yacy.java
Wed Oct 01 03:10:39 CEST 2014
by Michael Peter Christen
IPv6-enhanced Network monitoring page
Changed Files: htroot/Network.html, htroot/Network.java, htroot/Network.xml, htroot/env/grafics/NodeDisqualified.gif, htroot/env/grafics/NodeDisqualifiedIPv4.gif, htroot/env/grafics/NodeDisqualifiedIPv6.gif, htroot/env/grafics/NodeQualified.gif, htroot/env/grafics/NodeQualifiedIPv4.gif, htroot/env/grafics/NodeQualifiedIPv6.gif, source/net/yacy/peers/Network.java, source/net/yacy/peers/Protocol.java
Tue Sep 30 14:53:52 CEST 2014
by Michael Peter Christen
large IPv6 redesign of peer ping methods!
removed preferred IPv4 in start options and added a new field IP6 in
peer seeds which contains one or more IPv6 addresses. Every peer now
has one or more IP addresses assigned; even several IPv6 addresses are
possible. The peer-ping process must check all given and possible IP
addresses for a backping and return the one IP which was successful when
pinging the peer. The pinging peer must be able to recognize which of
the given IPs are available for outside access to the peer and store
this accordingly. If only one IPv6 address is available and no IPv4,
then the IPv6 address is stored in the old IP field of the seed DNA.
Many methods in Seed.java are now marked as @deprecated because they
were used for a single IP only. There is still a large construction site
left in YaCy now where all these deprecated methods must be replaced
with new method calls. The 'extra' IPs used by cluster assignment have
been removed since they can be replaced with IPv6 usage in p2p clusters.
All clusters must now use IPv6 if they want intranet routing.
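
The backping over several candidate addresses can be pictured with the following sketch. It is not the code in Protocol.java: the function name is hypothetical, the probe is passed in as a callable, and trying the addresses in the given order is an assumption.

```python
def first_reachable(addresses, probe):
    """Return the first address for which probe(address) succeeds,
    or None if the peer cannot be reached on any of them."""
    for address in addresses:
        if probe(address):
            return address
    return None
```

The returned address is the one a pinging peer would store as usable for outside access.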
Changed Files: addon/YaCy.app/Contents/Info.plist, addon/yacyInit.m4, htroot/Blog.java, htroot/BlogComments.java, htroot/ConfigNetwork_p.java, htroot/CrawlStartScanner_p.java, htroot/MessageSend_p.java, htroot/Messages_p.java, htroot/Network.html, htroot/Network.java, htroot/SettingsAck_p.java, htroot/Status.java, htroot/Surftips.java, htroot/ViewProfile.java, htroot/Wiki.java, htroot/goto_p.java, htroot/mediawiki_p.java, htroot/yacy/hello.java, htroot/yacy/query.java, installYaCyWindowsService.bat, source/net/yacy/cora/protocol/Domains.java, source/net/yacy/cora/protocol/RequestHeader.java, source/net/yacy/cora/protocol/http/HTTPClient.java, source/net/yacy/cora/util/CommonPattern.java, source/net/yacy/crawler/CrawlStacker.java, source/net/yacy/http/YacyDomainHandler.java, source/net/yacy/migration.java, source/net/yacy/peers/DHTSelection.java, source/net/yacy/peers/Network.java, source/net/yacy/peers/PeerActions.java, source/net/yacy/peers/Protocol.java, source/net/yacy/peers/RemoteSearch.java, source/net/yacy/peers/Seed.java, source/net/yacy/peers/SeedDB.java, source/net/yacy/peers/Transmission.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/query/SearchEvent.java, source/net/yacy/search/query/SearchEventCache.java, source/net/yacy/search/snippet/TextSnippet.java, source/net/yacy/server/http/AlternativeDomainNames.java, source/net/yacy/server/serverAccessTracker.java, source/net/yacy/server/serverSwitch.java, source/net/yacy/yacy.java, startYACY.sh, startYACY_debug.bat
Sun Sep 28 03:18:18 CEST 2014
by reger
update to PDFBox 1.8.7
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/fontbox-1.8.7.License, lib/fontbox-1.8.7.jar, lib/jempbox-1.8.7.License, lib/jempbox-1.8.7.jar, lib/pdfbox-1.8.7.License, lib/pdfbox-1.8.7.jar, pom.xml
Sat Sep 27 23:27:05 CEST 2014
by reger
update to Jetty 9.2.3
Changed Files: addon/YaCy.app/Contents/Info.plist, build.xml, htroot/Settings_ServerAccess.inc, lib/jetty-9.2.3.v20140905.License, lib/jetty-client-9.2.3.v20140905.jar, lib/jetty-continuation-9.2.3.v20140905.jar, lib/jetty-deploy-9.2.3.v20140905.jar, lib/jetty-http-9.2.3.v20140905.jar, lib/jetty-io-9.2.3.v20140905.jar, lib/jetty-jmx-9.2.3.v20140905.jar, lib/jetty-proxy-9.2.3.v20140905.jar, lib/jetty-security-9.2.3.v20140905.jar, lib/jetty-server-9.2.3.v20140905.jar, lib/jetty-servlet-9.2.3.v20140905.jar, lib/jetty-servlets-9.2.3.v20140905.jar, lib/jetty-util-9.2.3.v20140905.jar, lib/jetty-webapp-9.2.3.v20140905.jar, lib/jetty-xml-9.2.3.v20140905.jar, pom.xml
Wed Sep 17 13:58:55 CEST 2014
by Michael Peter Christen
changed the concurrent enumeration of query results in such a way that
it is now possible to get the results in two steps:
- first retrieve all IDs as given for a query
- then retrieve each document individually

This was necessary for very large result sets where a query may run for
hours and is possibly terminated by a solr-internal timeout. This occurs
regularly during postprocessing and therefore this commit may fix
unwanted postprocessing terminations.
Changed Files: htroot/HostBrowser.java, htroot/IndexDeletion_p.java, source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/ConcurrentUpdateSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrConnector.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/HyperlinkGraph.java
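Note on the commit above: the two-step enumeration can be pictured with a small sketch. It is not YaCy's or Solr's API. The class TwoStepRetrieval and the methods queryIds and getById are hypothetical names, and an in-memory map stands in for the index. The point is only that the first call returns nothing but IDs, so it stays short, while each document is then fetched on its own.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an index; not a YaCy or Solr class.
final class TwoStepRetrieval {
    private final Map<String, String> docs = new LinkedHashMap<>();

    void put(String id, String text) {
        docs.put(id, text);
    }

    // Step 1: a short query that returns only the IDs of matching documents.
    List<String> queryIds(String term) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> e : docs.entrySet()) {
            if (e.getValue().contains(term)) ids.add(e.getKey());
        }
        return ids;
    }

    // Step 2: fetch one document by ID (null if it no longer exists).
    String getById(String id) {
        return docs.get(id);
    }

    public static void main(String[] args) {
        TwoStepRetrieval index = new TwoStepRetrieval();
        index.put("a", "yacy peer");
        index.put("b", "solr index");
        index.put("c", "yacy index");
        for (String id : index.queryIds("yacy")) {
            System.out.println(id + " -> " + index.getById(id));
        }
    }
}
```

Because step 2 is one small request per document, a slow consumer no longer keeps a single long-running query open.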
Wed Sep 17 12:54:50 CEST 2014
by Michael Peter Christen
added a 'stats' table which records some peer statistics twice every
hour. The table can be shown with
http://localhost:8090/Tables_p.html?table=stats

The entries have the following meaning: 
aM: activeLastMonth
aW: activeLastWeek
aD: activeLastDay
aH: activeLastHour
cC: countConnected (Active Senior)
cD: countDisconnected (Passive Senior)
cP: countPotential (Junior)
cR: count of the RWI entries
cI: size of the index (number of documents)

The entry keys are abbreviated to reduce the space in the table as the
name is written again for every row.

This is the beginning of a 'yacystats' micro-alternative as a built-in
function in YaCy. Graphics may follow after some time if enough test
data is available.
Changed Files: source/net/yacy/cora/date/GenericFormatter.java, source/net/yacy/data/WorkTables.java, source/net/yacy/peers/SeedDB.java, source/net/yacy/search/Switchboard.java
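Note on the commit above: a reader of the 'stats' table may want the long names back. The following is a minimal sketch that only encodes the key list from the commit message. The class StatsKeys and the method expand are hypothetical and not part of YaCy, and the row type (key to number) is an assumption about how a row might be represented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the key meanings are copied from the commit message.
final class StatsKeys {
    static final Map<String, String> MEANING = new LinkedHashMap<>();
    static {
        MEANING.put("aM", "activeLastMonth");
        MEANING.put("aW", "activeLastWeek");
        MEANING.put("aD", "activeLastDay");
        MEANING.put("aH", "activeLastHour");
        MEANING.put("cC", "countConnected (Active Senior)");
        MEANING.put("cD", "countDisconnected (Passive Senior)");
        MEANING.put("cP", "countPotential (Junior)");
        MEANING.put("cR", "count of the RWI entries");
        MEANING.put("cI", "size of the index (number of documents)");
    }

    // Replaces abbreviated keys by their long form; unknown keys are kept as-is.
    static Map<String, Long> expand(Map<String, Long> row) {
        Map<String, Long> out = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : row.entrySet()) {
            out.put(MEANING.getOrDefault(e.getKey(), e.getKey()), e.getValue());
        }
        return out;
    }
}
```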


Bugfixes   

CommitDescription
Tue Jan 20 17:14:14 CET 2015
by Michael Peter Christen
fixed font size and print page generation in pdf snapshots
Changed Files: htroot/api/snapshot.java, source/net/yacy/cora/protocol/ClientIdentification.java, source/net/yacy/cora/util/Html2Image.java, source/net/yacy/crawler/data/Transactions.java, source/net/yacy/search/index/Segment.java
Mon Jan 12 00:35:47 CET 2015
by Michael Peter Christen
fix for mediawiki import
Changed Files: htroot/IndexImportMediawiki_p.html, htroot/IndexImportMediawiki_p.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Segment.java
Fri Jan 09 02:52:18 CET 2015
by reger
remove debug limit from previous commit
Changed Files: source/net/yacy/search/AutoSearch.java
Tue Jan 06 14:21:20 CET 2015
by Michael Peter Christen
fix for division by zero (rare cases)
Changed Files: htroot/NetworkHistory.java, source/net/yacy/visualization/ChartPlotter.java
Fri Dec 26 18:23:26 CET 2014
by reger
fix "null" title in response writer for documents with multivalued title
Changed Files: source/net/yacy/cora/federate/solr/responsewriter/HTMLResponseWriter.java
Tue Dec 23 00:30:34 CET 2014
by Michael Peter Christen
NPE fix
Changed Files: htroot/ConfigPortal.java, htroot/ViewLog_p.java
Mon Dec 22 14:32:09 CET 2014
by Michael Peter Christen
fix for pdf sub-page result preparation
Changed Files: source/net/yacy/cora/document/id/MultiProtocolURL.java
Mon Dec 22 14:24:09 CET 2014
by Michael Peter Christen
removed debug code
Changed Files: source/net/yacy/document/parser/pdfParser.java
Mon Dec 22 02:01:55 CET 2014
by Michael Peter Christen
fix to wkhtmltopdf usage
Changed Files: source/net/yacy/cora/util/Html2Image.java
Sun Dec 21 20:11:39 CET 2014
by Michael Peter Christen
fixes to wkhtmltopdf call
Changed Files: source/net/yacy/cora/util/Html2Image.java
Sun Dec 21 19:02:36 CET 2014
by Michael Peter Christen
prevent NPE during initialization of very large vocabularies
Changed Files: htroot/yacysearch.java, source/net/yacy/document/LibraryProvider.java
Sun Dec 21 17:53:06 CET 2014
by Michael Peter Christen
removed debug lines
Changed Files: htroot/ConfigParser.html, htroot/ConfigParser.java, htroot/yacy/hello.java
Sat Dec 20 15:11:06 CET 2014
by Michael Peter Christen
fix for division by zero
Changed Files: htroot/yacy/hello.java
Tue Dec 16 12:10:15 CET 2014
by Michael Peter Christen
fix for image parser (there is a class missing!)
Changed Files: source/net/yacy/document/parser/images/genericImageParser.java
Tue Dec 16 11:33:30 CET 2014
by Michael Peter Christen
fix for a count issue in snapshot api
Changed Files: htroot/api/snapshot.java, source/net/yacy/crawler/data/Snapshots.java, source/net/yacy/crawler/data/Transactions.java
Sun Dec 14 04:03:20 CET 2014
by Michael Peter Christen
fixes on wkhtmltopdf
Changed Files: htroot/api/snapshot.java, source/net/yacy/cora/util/Html2Image.java
Wed Dec 10 14:09:34 CET 2014
by Michael Peter Christen
fix for vocabulary import (double term detection)
Changed Files: htroot/Vocabulary_p.java
Wed Dec 10 13:14:39 CET 2014
by Michael Peter Christen
fix for Is Facet checkbox
Changed Files: htroot/Vocabulary_p.java
Mon Dec 08 01:35:37 CET 2014
by reger
prevent NPE on host link for too short HeuristicCfg.OpenSearchURL
Changed Files: htroot/ConfigHeuristics_p.java
Sat Dec 06 02:25:24 CET 2014
by reger
fix startup stop on missing HTCACHE/SNAPSHOT directory
Changed Files: source/net/yacy/search/Switchboard.java
Sat Dec 06 00:43:12 CET 2014
by Michael Peter Christen
npe fix
Changed Files: htroot/api/snapshot.java
Fri Nov 28 22:44:33 CET 2014
by reger
fix (enable) error msg on empty query
Changed Files: htroot/yacysearch.html, htroot/yacysearch.java
Fri Nov 28 01:19:31 CET 2014
by Michael Peter Christen
show vocabularies in search result (in case of debugging)
Changed Files: htroot/yacysearchitem.html, htroot/yacysearchitem.java
Fri Nov 28 01:19:01 CET 2014
by Michael Peter Christen
security bugfix
Changed Files: defaults/yacy.init, source/net/yacy/search/Switchboard.java
Mon Nov 24 20:53:40 CET 2014
by Michael Peter Christen
toString fix
Changed Files: source/net/yacy/http/ProxyHandler.java
Tue Nov 18 12:11:18 CET 2014
by Michael Peter Christen
fix field counter for multi-fields in html writer for the solr servlet
Changed Files: source/net/yacy/cora/federate/solr/responsewriter/HTMLResponseWriter.java
Thu Nov 13 00:59:30 CET 2014
by Michael Peter Christen
fix for wildcard patch in search queries
Changed Files: source/net/yacy/search/query/QueryModifier.java
Wed Nov 12 22:48:33 CET 2014
by Michael Peter Christen
fix for catchall handling in search
Changed Files: htroot/yacysearch.java
Mon Nov 10 18:52:01 CET 2014
by Michael Peter Christen
fix for bad table iteration
Changed Files: htroot/NetworkHistory.java, htroot/Table_YMark_p.java, htroot/Tables_p.java, htroot/api/table_p.java, source/net/yacy/kelondro/blob/BEncodedHeap.java, source/net/yacy/kelondro/blob/Tables.java, source/net/yacy/kelondro/index/RowSet.java
Mon Nov 10 02:18:44 CET 2014
by Michael Peter Christen
html fix
Changed Files: htroot/Tables_p.html
Fri Nov 07 18:12:09 CET 2014
by Michael Peter Christen
added long variables to debug output in index browser
Changed Files: htroot/NetworkHistory.java
Fri Nov 07 18:11:49 CET 2014
by Michael Peter Christen
fix of bad query generation for search facets
Changed Files: source/net/yacy/search/query/QueryModifier.java
Fri Nov 07 18:11:23 CET 2014
by Michael Peter Christen
fix for bad query generation in doublecheck in postprocessing
Changed Files: htroot/HostBrowser.java, source/net/yacy/cora/federate/solr/logic/Negation.java, source/net/yacy/search/schema/CollectionConfiguration.java
Sun Nov 02 13:28:10 CET 2014
by Michael Peter Christen
emergency bugfix for 100% CPU in image drawing
Changed Files: source/net/yacy/visualization/ChartPlotter.java
Fri Oct 31 23:17:56 CET 2014
by Michael Peter Christen
fix for bad timing computation in postprocessing
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java
Thu Oct 30 20:53:57 CET 2014
by orbiter
more fixes and enhancements to postprocessing
Changed Files: source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java
Thu Oct 30 15:47:44 CET 2014
by orbiter
enhanced debug code in host browser
Changed Files: htroot/HostBrowser.java
Thu Oct 30 15:20:35 CET 2014
by orbiter
fix for npe (in rare cases)
Changed Files: source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java
Thu Oct 30 15:01:27 CET 2014
by orbiter
fix for literal computation
Changed Files: source/net/yacy/cora/federate/solr/logic/LongLiteral.java
Thu Oct 30 12:41:04 CET 2014
by Michael Peter Christen
fix for broken protocol navigation
Changed Files: htroot/yacysearchtrailer.java
Wed Oct 29 16:52:58 CET 2014
by orbiter
more IPv6 fixes
Changed Files: readme.txt, source/net/yacy/cora/protocol/Domains.java, source/net/yacy/cora/protocol/ftp/FTPClient.java
Tue Oct 28 15:36:13 CET 2014
by Michael Peter Christen
IPv6 fix
Changed Files: source/net/yacy/cora/protocol/Domains.java, source/net/yacy/peers/Seed.java, source/net/yacy/server/serverSwitch.java
Wed Oct 22 11:25:07 CEST 2014
by sixcooler
fix for ConnectionInfo.cleanup of server-connections
Changed Files: source/net/yacy/cora/protocol/ConnectionInfo.java
Fri Oct 17 21:32:07 CEST 2014
by orbiter
fix for bad json
Changed Files: htroot/yacy/seedlist.java
Tue Oct 14 12:48:15 CEST 2014
by Michael Peter Christen
npe fix
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java
Mon Oct 13 16:53:00 CEST 2014
by Michael Peter Christen
fix for api icon in yacysearch_location.html
Changed Files: htroot/yacysearch_location.html
Mon Oct 13 14:28:11 CEST 2014
by Michael Peter Christen
fixed location search
Changed Files: source/net/yacy/cora/document/feed/RSSMessage.java, source/net/yacy/search/schema/CollectionConfiguration.java
Sat Oct 11 00:34:07 CEST 2014
by Michael Peter Christen
more ipv6 fixes
Changed Files: htroot/MessageSend_p.java, source/net/yacy/peers/Protocol.java, source/net/yacy/peers/SeedDB.java
Wed Oct 08 15:48:45 CEST 2014
by Michael Peter Christen
fixed appearance of RSS icon on search result page
Changed Files: htroot/env/base.css, htroot/yacysearch.html, skins/27c3.css, skins/28c3.css, skins/generic_pd.css, skins/pdblue.css, skins/pdbootstrap.css
Wed Oct 08 15:22:29 CEST 2014
by Michael Peter Christen
misc bugfixes (concurrency, memory protection)
Changed Files: source/net/yacy/cora/protocol/ConnectionInfo.java, source/net/yacy/gui/Audio.java, source/net/yacy/kelondro/table/Table.java
Wed Oct 08 15:21:49 CEST 2014
by Michael Peter Christen
more ipv6 bugfixes
Changed Files: htroot/HostBrowser.java, source/net/yacy/cora/document/id/DigestURL.java, source/net/yacy/cora/document/id/MultiProtocolURL.java, source/net/yacy/crawler/robots/RobotsTxt.java
Wed Oct 08 13:44:03 CEST 2014
by Michael Peter Christen
fix for local search
Changed Files: source/net/yacy/peers/RemoteSearch.java
Wed Oct 08 12:38:56 CEST 2014
by Michael Peter Christen
more ipv6 bugfixes
Changed Files: htroot/Blog.java, htroot/BlogComments.java, htroot/Bookmarks.java, htroot/ConfigPortal.java, htroot/ConfigRobotsTxt_p.java, htroot/ConfigSearchBox.java, htroot/Load_MediawikiWiki.java, htroot/Load_PHPBB3.java, htroot/Messages_p.java, htroot/News.java, htroot/Status.java, htroot/Table_YMark_p.java, htroot/Tables_p.java, htroot/ViewProfile.java, htroot/Wiki.java, htroot/api/bookmarks/get_bookmarks.java, htroot/api/bookmarks/get_folders.java, htroot/api/ynetSearch.java, htroot/goto_p.java, htroot/mediawiki_p.java, htroot/yacy/hello.java, htroot/yacy/seedlist.java, htroot/yacysearch_location.java, source/net/yacy/peers/Network.java, source/net/yacy/peers/Protocol.java, source/net/yacy/peers/Seed.java, source/net/yacy/peers/SeedDB.java, source/net/yacy/search/snippet/ResultEntry.java
Tue Oct 07 23:30:32 CEST 2014
by Michael Peter Christen
fix for remote search process
Changed Files: source/net/yacy/cora/federate/solr/connector/ConcurrentUpdateSolrConnector.java, source/net/yacy/peers/Protocol.java, source/net/yacy/peers/RemoteSearch.java, source/net/yacy/server/serverAccessTracker.java
Tue Oct 07 22:16:18 CEST 2014
by Michael Peter Christen
fix for bad node flag setting with IPv6
Changed Files: source/net/yacy/cora/protocol/Domains.java, source/net/yacy/http/ProxyHandler.java, source/net/yacy/peers/Protocol.java
Tue Oct 07 21:57:41 CEST 2014
by Michael Peter Christen
ipv6 fixes for Network.html front page
Changed Files: htroot/Network.html, htroot/Network.java
Tue Oct 07 20:09:48 CEST 2014
by orbiter
more ipv6 fixes
Changed Files: source/net/yacy/peers/Protocol.java, source/net/yacy/peers/SeedDB.java
Tue Oct 07 18:53:23 CEST 2014
by Michael Peter Christen
more ipv6 fixes
Changed Files: source/net/yacy/cora/protocol/Domains.java, source/net/yacy/peers/Seed.java
Tue Oct 07 17:52:13 CEST 2014
by Michael Peter Christen
fix for latest UPnP update
Changed Files: htroot/yacysearch_location.java
Mon Oct 06 17:44:27 CEST 2014
by Michael Peter Christen
more IPv6 bugfixes
Changed Files: htroot/MessageSend_p.java, htroot/Network.java, source/net/yacy/cora/document/feed/RSSReader.java, source/net/yacy/cora/protocol/Domains.java, source/net/yacy/crawler/data/Cache.java, source/net/yacy/peers/Network.java, source/net/yacy/peers/PeerActions.java, source/net/yacy/peers/Protocol.java, source/net/yacy/peers/Seed.java, source/net/yacy/peers/SeedDB.java, source/net/yacy/peers/Transmission.java, source/net/yacy/server/http/HTTPDemon.java
Sun Oct 05 11:03:57 CEST 2014
by Michael Peter Christen
toString fixes
Changed Files: source/net/yacy/data/WorkTables.java
Fri Oct 03 08:51:23 CEST 2014
by reger
fix char encoding parameter in UrlProxy
Changed Files: source/net/yacy/http/servlets/UrlProxyServlet.java
Wed Oct 01 15:34:43 CEST 2014
by Michael Peter Christen
unresolved pattern fix
Changed Files: htroot/CrawlProfileEditor_p.xml
Wed Oct 01 15:32:10 CEST 2014
by Michael Peter Christen
ipv6 fixes
Changed Files: htroot/Network.html, htroot/Network.java, htroot/env/templates/submenuComputation.template, source/net/yacy/cora/protocol/Domains.java, source/net/yacy/http/AbstractRemoteHandler.java, source/net/yacy/peers/Protocol.java
Wed Oct 01 12:22:55 CEST 2014
by Michael Peter Christen
fix for xss bugs found by CTF365
Changed Files: htroot/yacyinteractive.java, htroot/yacysearch.java
Wed Oct 01 10:21:03 CEST 2014
by Michael Peter Christen
IPv6 host parsing bugfixes
Changed Files: htroot/IndexCreateQueues_p.java, htroot/Settings_p.java, source/net/yacy/cora/document/id/MultiProtocolURL.java, source/net/yacy/cora/protocol/Domains.java, source/net/yacy/cora/protocol/HeaderFramework.java, source/net/yacy/http/AbstractRemoteHandler.java, source/net/yacy/http/YacyDomainHandler.java, source/net/yacy/peers/Protocol.java, source/net/yacy/server/http/HTTPDProxyHandler.java, source/net/yacy/server/http/HTTPDemon.java
Fri Sep 26 22:34:11 CEST 2014
by reger
update German translation http://mantis.tokeek.de/view.php?id=447
Changed Files: locales/de.lng
Sat Sep 20 13:06:46 CEST 2014
by Michael Peter Christen
fix for crawl limit for number of pages fail
Changed Files: source/net/yacy/crawler/CrawlStacker.java, source/net/yacy/crawler/HostQueue.java
Tue Sep 16 17:46:07 CEST 2014
by Michael Peter Christen
fix for Mac.app config, but it still does not run. Looking for build bug.
Changed Files: addon/YaCy.app/Contents/Info.plist


Other Changes   

CommitDescription
Wed Jan 21 12:45:55 CET 2015
by Michael Peter Christen
Release 1.82
Changed Files: build.properties
Tue Jan 20 18:18:12 CET 2015
by Michael Peter Christen
reverted 'do not show all options' strategy. This is actually confusing
new users. It may be activated again if there is an optional
tutorial mode which can be switched on for this special purpose of
running a tutorial.
Changed Files: source/net/yacy/http/servlets/YaCyDefaultServlet.java
Fri Jan 09 16:45:43 CET 2015
by Michael Peter Christen
a test with http://validator.w3.org/feed/#validate_by_input shows that
the time format was wrong; we must use RFC-822
Changed Files: source/net/yacy/cora/document/feed/RSSMessage.java, source/net/yacy/cora/protocol/HeaderFramework.java
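Note on the commit above: a date in the RFC 822 layout that feed validators accept can be produced with the JDK alone. This is a minimal sketch, not the code from RSSMessage.java or HeaderFramework.java. The class name Rfc822Date and the choice of GMT as output zone are assumptions.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not a YaCy class.
final class Rfc822Date {
    // RFC 822 date layout with a four-digit year and a numeric zone offset.
    // Locale.US keeps day and month names in English regardless of the host locale.
    static String format(Date date) {
        SimpleDateFormat f = new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss Z", Locale.US);
        f.setTimeZone(TimeZone.getTimeZone("GMT"));
        return f.format(date);
    }

    public static void main(String[] args) {
        // prints "Thu, 01 Jan 1970 00:00:00 +0000"
        System.out.println(format(new Date(0L)));
    }
}
```

A new SimpleDateFormat is created per call because instances are not thread-safe.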
Fri Jan 09 02:06:30 CET 2015
by reger
Add option for extended search (Autosearch) to Bookmarks.html, asking all connected peers for the search term given as the description of a bookmark created via the bookmark icon.
Intended for searches/research projects with insufficient results from local and DHT-selected remote target peers.

Function: the process checks newly created bookmarks for a description starting with "query=...", uses it to ask every peer for 20 search results, and adds them to the local index in a background job.
A link to start/stop the process was added to /Bookmarks.html
Changed Files: htroot/Bookmarks.html, htroot/Bookmarks.java, source/net/yacy/data/BookmarksDB.java, source/net/yacy/search/AutoSearch.java
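Note on the commit above: the check for a "query=" description can be sketched in a few lines. This is an illustration only. BookmarkQuery and extractQuery are hypothetical names, and the real AutoSearch.java may treat whitespace or empty terms differently.

```java
// Hypothetical helper, not the code in AutoSearch.java.
final class BookmarkQuery {
    static final String PREFIX = "query=";

    // Returns the search term if the description starts with "query=",
    // otherwise null (also for a missing or empty term).
    static String extractQuery(String description) {
        if (description == null || !description.startsWith(PREFIX)) return null;
        String term = description.substring(PREFIX.length()).trim();
        return term.isEmpty() ? null : term;
    }

    public static void main(String[] args) {
        System.out.println(extractQuery("query=yacy p2p"));   // yacy p2p
        System.out.println(extractQuery("just a bookmark"));  // null
    }
}
```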
Fri Jan 09 01:33:45 CET 2015
by reger
Add title import for bookmark icon
if avail in index
Changed Files: htroot/yacysearch.java
Fri Jan 09 01:31:57 CET 2015
by reger
- add javadoc to busythread with hint about the init parameter usage
- remove obsolete 10_httpd config parameter
Changed Files: htroot/PerformanceQueues_p.java, source/net/yacy/kelondro/workflow/AbstractBusyThread.java
Tue Jan 06 15:22:59 CET 2015
by Michael Peter Christen
documents pushed over the api/push_p.html interface will have their
unique flag set by default
Changed Files: source/net/yacy/crawler/CrawlSwitchboard.java, source/net/yacy/crawler/data/CrawlProfile.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/schema/CollectionConfiguration.java
Tue Jan 06 14:22:43 CET 2015
by Michael Peter Christen
better scale
Changed Files: htroot/NetworkHistory.java
Tue Jan 06 14:14:25 CET 2015
by Michael Peter Christen
do not write frame links to webgraph
Changed Files: source/net/yacy/document/Document.java, source/net/yacy/document/parser/html/ContentScraper.java
Mon Jan 05 09:10:20 CET 2015
by reger
revert clickservlet 
(default was indeed a mistake)
Changed Files: defaults/web.xml, defaults/yacy.init, htroot/ConfigPortal.html, htroot/ConfigPortal.java, htroot/yacysearchitem.java, source/net/yacy/search/SwitchboardConstants.java
Mon Jan 05 08:21:51 CET 2015
by Michael Peter Christen
do not use the clickservlet by default. From my personal view, this
technique should not be used at all! This project is about privacy, the
existence of a click servlet is one example why people should NOT use a
search portal if such exists. 
Changed Files: defaults/yacy.init
Mon Jan 05 08:18:19 CET 2015
by Michael Peter Christen
please commit new files under your own name, this file was not created
by me.
Changed Files: source/net/yacy/http/servlets/ClickServlet.java
Mon Jan 05 06:55:53 CET 2015
by reger
added url to bookmark icon link
url is needed anyway, saves index lookup and works w/o committed url.
Removed unused order parameter
Changed Files: htroot/yacysearch.java, htroot/yacysearchitem.java
Sun Jan 04 09:12:30 CET 2015
by reger
fix NPE in viewimage

Caused by: java.lang.NullPointerException
	at net.yacy.peers.graphics.EncodedImage.<init>(EncodedImage.java:73)
	at ViewImage.respond(ViewImage.java:156)
Changed Files: htroot/ViewImage.java
Sun Jan 04 06:57:13 CET 2015
by reger
fix ConfigPortal jumps to iframe focus
add focus parameter to yacysearch.html too
Changed Files: htroot/yacysearch.html, htroot/yacysearch.java
Sun Jan 04 02:59:21 CET 2015
by reger
add info text to metadata page (htmlresponsewriter) on no documents found
Changed Files: source/net/yacy/cora/federate/solr/responsewriter/HTMLResponseWriter.java
Fri Jan 02 04:20:02 CET 2015
by reger
improve TexParser.mimeOf( fileextension ) by returning 1st defined in supported list.
This prevents unusual mapping of supported fileextension -> mimetype
(like htm=application/x-tex)
Changed Files: source/net/yacy/document/AbstractParser.java
Fri Jan 02 02:44:03 CET 2015
by Michael Peter Christen
do not write iframe and embed links into webgraph, but use them anyway
for crawling
Changed Files: source/net/yacy/document/Document.java, source/net/yacy/document/parser/html/ContentScraper.java, source/net/yacy/search/schema/CollectionConfiguration.java
Fri Jan 02 00:11:32 CET 2015
by Ryszard Go?
Fix for progress table background not resizing
when the post-processing started/ended.
Changed Files: htroot/Crawler_p.html
Thu Jan 01 02:41:20 CET 2015
by reger
adjustments for Bookmark icon to act on BookmarkDB.
It acts on YMarks, but the YMark interface does not seem to be maintained.
For future features (e.g. query memory) BookmarkDB is the likely choice to expand, so besides the crawlstart bookmark, the result bookmark icon now also adds to BookmarkDB.
The YMark related code is (for now) left untouched so both tables are updated.

Changed Files: htroot/yacysearch.java, htroot/yacysearchitem.java
Mon Dec 29 03:50:00 CET 2014
by reger
remove obsolete config footer option (ConfigPortal user.login)
no footer or footer-option in use

remove unused yacy.init item allowUnlimitedReceiveIndexFrom
Changed Files: defaults/yacy.init, htroot/ConfigPortal.html, htroot/ConfigPortal.java
Sun Dec 28 15:52:43 CET 2014
by Michael Peter Christen
reactivated clear stacks code for termination of all crawls because this
did not work without that part of the code
Changed Files: htroot/Crawler_p.java
Sun Dec 28 15:48:37 CET 2014
by Michael Peter Christen
do not flush non-errors to stdout because this is a concurrency issue.
the flush-call appeared very often in thread dumps with high load, so
this hopefully improves performance
Changed Files: source/net/yacy/kelondro/logging/ConsoleOutErrHandler.java
Sun Dec 28 14:53:55 CET 2014
by Michael Peter Christen
do not translate gif images into png images for thumbnails. Instead,
stream the original to the search result thumb viewer. This has two
reasons:
- animated gifs cause 100% cpu and deadlocks in the jvm gif parser; a
known bug which is obviously not yet fixed
- animated gifs now appear in the search result also as animation
Changed Files: htroot/ViewImage.java, htroot/yacysearchitem.java, source/net/yacy/search/query/QueryModifier.java
Sun Dec 28 14:36:43 CET 2014
by Michael Peter Christen
automatically set the Q flag for smb/ftp start urls (split pdf support)
Changed Files: htroot/js/IndexCreate.js
Sun Dec 28 14:27:42 CET 2014
by Michael Peter Christen
automatically switch on query option in case intranet protocols (smb/ftp)
are used. This supports the new split-pdf option.
Changed Files: htroot/Crawler_p.java
Sat Dec 27 09:52:34 CET 2014
by arucard21
Applied URL-decoding prior to HTML-encoding.
This removes percent-encoding from text shown in HTML
Changed Files: source/net/yacy/server/serverObjects.java
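Note on the commit above: the order of the two steps matters, because decoding after HTML-encoding could reintroduce markup characters. The following is a minimal sketch of "decode first, then encode" using only the JDK. DisplayText, escapeHtml and forHtml are hypothetical names, and the fallback for malformed escapes is an assumption, not the behaviour of serverObjects.java.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper, not the code in serverObjects.java.
final class DisplayText {
    // Minimal HTML escaping for the characters that are significant in markup.
    static String escapeHtml(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&#39;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Decode percent-escapes first, then escape for HTML, so a decoded '<'
    // never reaches the page unescaped. Note that URLDecoder also turns '+'
    // into a space, which suits query strings but not every URL part.
    static String forHtml(String raw) {
        String decoded;
        try {
            decoded = URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException | IllegalArgumentException e) {
            decoded = raw; // malformed escape: fall back to the raw text (assumption)
        }
        return escapeHtml(decoded);
    }
}
```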
Sat Dec 27 03:02:18 CET 2014
by Ryszard Go?
Postprocessing progress bar fix
(Make it work as [probably] actually intended)
Changed Files: htroot/Crawler_p.html, htroot/js/Crawler.js
Sat Dec 27 00:10:14 CET 2014
by reger
Init Jetty using setDefaultDescriptor (web.xml) to defaults/web.xml
so web.xml in defaults dir is applied first and optional DATA/SETTINGS/web.xml loaded on top.
By using this Jetty feature (default web.xml) we ensure that changes to the default are applied to existing installations
and individual additions/changes are still respected.
Changed Files: defaults/web.xml, source/net/yacy/http/Jetty9HttpServerImpl.java
Fri Dec 26 18:21:35 CET 2014
by reger
adjust fieldtype and description of field httpstatus_redirect_s in CollectionSchema
- the field is not used (delete candidate)
Changed Files: source/net/yacy/search/schema/CollectionSchema.java
Thu Dec 25 02:21:45 CET 2014
by reger
fix NPE related 500 (Bad Request) response of UrlProxy on blacklisted urls,
by adding parameter HTTPDeamon and removing unused hostAddress lookup code in sendRespondError
Changed Files: source/net/yacy/http/servlets/UrlProxyServlet.java, source/net/yacy/server/http/HTTPDemon.java
Thu Dec 25 02:16:19 CET 2014
by reger
improve yacysearchitem, 
prevent allocation of String (modifyURL) if feature not used
Changed Files: htroot/yacysearchitem.java
Thu Dec 25 02:13:44 CET 2014
by reger
add xmpcore as direct dependency to pom
(otherwise it's looked up at pdfbox archive path and not found there)
Changed Files: pom.xml
Wed Dec 24 12:23:59 CET 2014
by Michael Peter Christen
crawling of multi-page pdfs with artificial post part on smb or ftp
shares is not possible with the disabled setting; this is now temporarily
disabled until a better solution is at hand.
Changed Files: htroot/js/IndexCreate.js
Wed Dec 24 00:04:35 CET 2014
by reger
fix div by 0 in hello
Caused by: java.lang.ArithmeticException: / by zero
	at hello.respond(hello.java:159)
Changed Files: htroot/yacy/hello.java
Tue Dec 23 19:11:21 CET 2014
by reger
update to SLF4J 1.7.9
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/jcl-over-slf4j-1.7.9.jar, lib/log4j-over-slf4j-1.7.9.jar, lib/slf4j-api-1.7.9.jar, lib/slf4j-jdk14-1.7.9.jar, nbproject/project.xml, pom.xml
Tue Dec 23 02:01:03 CET 2014
by reger
fix proxy redirect (http status 302) response
fixes http://mantis.tokeek.de/view.php?id=517

The url given in the bug report uses a gzip input stream, which causes HTTPClient.writeto() to throw an IOException due to an incomplete input stream. This in turn prevents the 302 response to the client browser.
Limiting the serving of target content to httpstatus=200 proxies the header response, so the client browser's redirect settings can be honored.
Changed Files: source/net/yacy/http/ProxyHandler.java, source/net/yacy/server/http/HTTPDProxyHandler.java
Tue Dec 23 00:37:51 CET 2014
by Michael Peter Christen
enhanced initialization of autotagging
Changed Files: source/net/yacy/cora/lod/vocabulary/Tagging.java
Mon Dec 22 20:36:29 CET 2014
by reger
add hint to Heuristics Config on "Greedy Learning Mode" in portal config,
to point to an option to make this setting permanent.
Changed Files: htroot/ConfigHeuristics_p.html, htroot/ConfigPortal.html
Mon Dec 22 20:34:13 CET 2014
by reger
update to commons-fileupload-1.3.1.jar 
(includes a security fix)
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/commons-fileupload-1.3.1.License, lib/commons-fileupload-1.3.1.jar, nbproject/project.xml, pom.xml
Sun Dec 21 19:17:06 CET 2014
by Michael Peter Christen
changed prefer strategy for http unique in such a way that http is
preferred over https. While this is a bad idea from the standpoint of
security it is more commonly applicable for environments where http and
https mix and for some domains https is not available. Then the
double-check is possible even if no postprocessing is performed.
Changed Files: defaults/yacy.init, source/net/yacy/search/Switchboard.java, source/net/yacy/search/schema/CollectionConfiguration.java
Sun Dec 21 19:08:28 CET 2014
by Michael Peter Christen
fix to prevent assertion error in ranking servlet if no vocabularies are
present that could be evaluated
Changed Files: htroot/RankingSolr_p.java
Sun Dec 21 17:31:51 CET 2014
by Michael Peter Christen
the miss cache does not seem to work, it sometimes contains urlhashes
from documents which actually are inside the index. This can be
reproduced using the crawl result table at 
http://localhost:8090/CrawlResults.html?process=5
The cache is temporarily disabled to remove the bad behaviour, however a
later reactivation of that feature may be possible.
Changed Files: defaults/yacy.init, source/net/yacy/cora/federate/solr/connector/ConcurrentUpdateSolrConnector.java
Sun Dec 21 14:02:06 CET 2014
by reger
fix refactored size() -> filesize() in YMarkMetadata
Changed Files: source/net/yacy/data/ymark/YMarkMetadata.java
Sun Dec 21 06:05:35 CET 2014
by reger
refactor size() -> filesize() of URIMetadataNode
(harmonize with ResultEntry and to not get confused with Collection.size())
Changed Files: htroot/ViewFile.java, htroot/api/yacydoc.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/search/snippet/ResultEntry.java
Sun Dec 21 03:45:54 CET 2014
by reger
remove redundant caching of urlhash in URIMetadataNode
(is already cached in underlying DigestURL .url)

upd pom keyword for maven-antrun-plugin
Changed Files: pom.xml, source/net/yacy/kelondro/data/meta/URIMetadataNode.java
Sat Dec 20 01:59:00 CET 2014
by reger
use peeraddress for link in remote crawl list
to make link work without enabled proxy

upd pom for Jetty (missing in last commit)
Changed Files: htroot/RemoteCrawl_p.html, htroot/RemoteCrawl_p.java, pom.xml
Fri Dec 19 17:41:38 CET 2014
by Michael Peter Christen
preventing the use of no-cache and expires in case that images are
generated dynamically which will stay static in the future. This applies
mainly to the search result favicon in front of search hits. These icons
will now be generated once, but then cached in the browser. There is
also a YaCy-internal cache for these icons which had prevented the
re-generation of the icons in YaCy, but this cache is now superfluous
since the browser should not call the servlet ViewImage again.
Changed Files: htroot/api/snapshot.java, htroot/osm.java, htroot/yacysearchitem.html, htroot/yacysearchitem.java, source/net/yacy/peers/graphics/EncodedImage.java
Fri Dec 19 17:38:58 CET 2014
by Michael Peter Christen
fixes for searches when initialization of large autotagging libraries
has not finished
Changed Files: htroot/ViewImage.java, htroot/yacysearch.java, source/net/yacy/search/query/QueryParams.java
Fri Dec 19 17:37:58 CET 2014
by Michael Peter Christen
fixes to usage of no-cache: use and recognize also the no-store
directive
Changed Files: htroot/NetworkPicture.java, source/net/yacy/crawler/retrieval/Response.java, source/net/yacy/http/servlets/SolrSelectServlet.java, source/net/yacy/http/servlets/YaCyDefaultServlet.java, source/net/yacy/peers/SeedDB.java, source/net/yacy/search/Switchboard.java, source/net/yacy/server/http/HTTPDemon.java
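Note on the commit above: recognizing both directives in a Cache-Control header value can be sketched as follows. This is an illustration, not the check used in Response.java or YaCyDefaultServlet.java. CacheDirectives and isNoCacheOrNoStore are hypothetical names.

```java
import java.util.Locale;

// Hypothetical helper, not a YaCy class.
final class CacheDirectives {
    // True if the header value contains the no-cache or the no-store directive.
    // Directive names are case-insensitive and may carry an argument,
    // e.g. no-cache="set-cookie", which is ignored here.
    static boolean isNoCacheOrNoStore(String cacheControl) {
        if (cacheControl == null) return false;
        for (String part : cacheControl.split(",")) {
            String directive = part.trim().toLowerCase(Locale.ROOT);
            int eq = directive.indexOf('=');
            if (eq >= 0) directive = directive.substring(0, eq).trim();
            if (directive.equals("no-cache") || directive.equals("no-store")) return true;
        }
        return false;
    }
}
```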
Fri Dec 19 11:51:14 CET 2014
by Michael Peter Christen
reduction of http requests to YaCy using the correct cache-control,
expires and last-modified headers in http response.
Changed Files: source/net/yacy/http/servlets/YaCyDefaultServlet.java
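Note on the commit above: the three headers named there can be assembled like this. It is a sketch under assumptions, not the logic of YaCyDefaultServlet.java: the class CachingHeaders, the method forStaticContent, the "public" directive and the max-age value chosen by the caller are all illustrative.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not a YaCy class.
final class CachingHeaders {
    // HTTP date layout with a fixed two-digit day and the literal zone "GMT".
    private static final DateTimeFormatter HTTP_DATE = DateTimeFormatter
            .ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
            .withZone(ZoneOffset.UTC);

    // Headers that allow a browser to reuse a response for maxAgeSeconds
    // and to revalidate it afterwards using the Last-Modified date.
    static Map<String, String> forStaticContent(Instant now, Instant lastModified, long maxAgeSeconds) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Cache-Control", "public, max-age=" + maxAgeSeconds);
        headers.put("Expires", HTTP_DATE.format(now.plusSeconds(maxAgeSeconds)));
        headers.put("Last-Modified", HTTP_DATE.format(lastModified));
        return headers;
    }
}
```

With these headers a browser does not ask again before the expiry time, which is what reduces the number of requests.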
Fri Dec 19 01:58:37 CET 2014
by reger
fix missing AppPath
upd Maven plugin versionid
Changed Files: pom.xml, source/net/yacy/search/Switchboard.java
Tue Dec 16 21:12:37 CET 2014
by reger
include xmpcore.jar in classpath
used by metadata-extractor
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/xmpcore-5.1.2.license, nbproject/project.xml
Tue Dec 16 21:10:53 CET 2014
by malykhin.dmitry
Update Russian translation
Changed Files: locales/ru.lng
Tue Dec 16 13:53:12 CET 2014
by Michael Peter Christen
added query modifier 'on'. This makes it possible to search for date
occurrences within the (web) page documents (not the document
last-modified!). This works only if the solr field dates_in_content_sxt
is enabled. A search request may then have the form "term on:<date>",
like
gift on:24.12.2014
gift on:2014/12/24
* on:2014/12/31
For the date format you may use any kind of human-readable date
representation(!yes!) - the on:<date> parser tries to identify language
and also knows event names, like:
bunny on:eastern
.. as long as the date term has no spaces inside (use a dot). Further
enhancement will be made to accept also strings encapsulated with
quotes.
Changed Files: source/net/yacy/document/DateDetection.java, source/net/yacy/search/query/QueryModifier.java, source/net/yacy/search/query/QueryParams.java
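The "term on:<date>" form described in this entry can be sketched as follows. The class OnModifierSketch is hypothetical and is not YaCy's QueryModifier; it only separates the on: token from the remaining terms and leaves the date expression unparsed:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not YaCy's QueryModifier: splits "term on:<date>" queries. */
final class OnModifierSketch {

    /** The raw date expression after "on:", or null if the query has no such token. */
    final String onDate;
    /** The remaining query terms, joined by a single space. */
    final String terms;

    OnModifierSketch(final String query) {
        String date = null;
        final List<String> rest = new ArrayList<>();
        for (final String token : query.trim().split("\\s+")) {
            if (date == null && token.startsWith("on:") && token.length() > 3) {
                // the date term has no spaces inside, so it is exactly one token
                date = token.substring(3);
            } else if (!token.isEmpty()) {
                rest.add(token);
            }
        }
        this.onDate = date;
        this.terms = String.join(" ", rest);
    }
}
```

With the example queries from this entry, "gift on:24.12.2014" would yield the term "gift" and the raw date "24.12.2014"; interpreting that date is left to a date parser.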
Tue Dec 16 13:18:49 CET 2014
by Michael Peter Christen
added (very experimental) Solr response writer for snapshot image
results
Changed Files: source/net/yacy/cora/federate/solr/responsewriter/HTMLResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/SnapshotImagesReponseWriter.java, source/net/yacy/http/servlets/SolrSelectServlet.java
Tue Dec 16 12:39:10 CET 2014
by Michael Peter Christen
added url, date, time and page number on pdf snapshot footer
Changed Files: source/net/yacy/cora/util/Html2Image.java
Tue Dec 16 12:09:57 CET 2014
by Michael Peter Christen
reactivated on-demand snapshot loading
Changed Files: htroot/api/snapshot.java, source/net/yacy/crawler/data/Transactions.java, source/net/yacy/search/index/Segment.java
Mon Dec 15 22:54:49 CET 2014
by reger
add final SolrQueryRequest.close to SolrServlet
Changed Files: source/net/yacy/http/servlets/SolrServlet.java
Mon Dec 15 05:56:12 CET 2014
by Michael Peter Christen
added a note that the servlet is linked using web.xml
Changed Files: source/net/yacy/http/servlets/UrlProxyServlet.java
Sun Dec 14 21:27:45 CET 2014
by reger
- fix path to default heuristic.cfg
- deprecate unused ProxyServlet
Changed Files: htroot/ConfigHeuristics_p.java, source/net/yacy/http/servlets/YaCyProxyServlet.java
Sun Dec 14 19:17:13 CET 2014
by reger
add chardet.jar to Maven dependencies
Changed Files: nbproject/project.xml, pom.xml
Sun Dec 14 19:12:18 CET 2014
by reger
fix yacy.init comment
http://mantis.tokeek.de/view.php?id=513
Changed Files: defaults/yacy.init
Sun Dec 14 13:43:30 CET 2014
by Michael Peter Christen
add the actual DateDetection class... (missed in latest commit)
Changed Files: source/net/yacy/document/DateDetection.java
Sun Dec 14 04:02:13 CET 2014
by Michael Peter Christen
enable sku as anchor in html response writer
Changed Files: defaults/solr.collection.schema, source/net/yacy/cora/federate/solr/responsewriter/HTMLResponseWriter.java
Sat Dec 13 09:54:41 CET 2014
by Michael Peter Christen
enhanced tagging preparation speed which reduces initialization time for
very large vocabularies
Changed Files: source/net/yacy/cora/lod/vocabulary/Tagging.java
Thu Dec 11 23:37:41 CET 2014
by Michael Peter Christen
refactoring date -> lastModified
Changed Files: source/net/yacy/document/Document.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/schema/CollectionConfiguration.java
Wed Dec 10 14:10:05 CET 2014
by Michael Peter Christen
added concurrent generation of snapshot pdfs
Changed Files: source/net/yacy/crawler/data/Snapshots.java, source/net/yacy/crawler/data/Transactions.java
Wed Dec 10 13:11:51 CET 2014
by Michael Peter Christen
added charset detection to vocabulary reader
Changed Files: htroot/Vocabulary_p.java
Wed Dec 10 13:08:29 CET 2014
by Michael Peter Christen
added character set detection library from
http://www-archive.mozilla.org/projects/intl/chardet.html
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/chardet.License, lib/chardet.jar, source/net/yacy/kelondro/util/FileUtils.java
Wed Dec 10 12:20:27 CET 2014
by Michael Peter Christen
added new options to vocabulary editor:
- new switch 'isFacet' which enables or disables the usage of the
vocabulary for search facets. This should be used for large
vocabularies since searches in solr are extremely slow if facets for a
large set of alternative terms are generated
- new option to disable auto-enrichment from synonyms
- new option to add synonyms from another column when importing from csv
- automatically recognize double occurrences in synonyms and bundle the
terms for such synonyms
Changed Files: htroot/Vocabulary_p.html, htroot/Vocabulary_p.java, source/net/yacy/cora/lod/vocabulary/Tagging.java, source/net/yacy/search/query/QueryParams.java
Tue Dec 09 00:58:08 CET 2014
by reger
remove redundant null check in ResponseHeader.lastModified
added a JUnit testcase for ResponseHeader dates (using age()),
adjusted age() to pass all tests
Changed Files: source/net/yacy/cora/protocol/ResponseHeader.java, test/net/yacy/cora/protocol/ResponseHeaderTest.java
Mon Dec 08 11:41:28 CET 2014
by Michael Peter Christen
added confirmation dialogs for row deletion
Changed Files: htroot/Table_API_p.html
Mon Dec 08 11:35:40 CET 2014
by Michael Peter Christen
more robustness for broken table data in Table_API_p.html -- see bug
report http://mantis.tokeek.de/view.php?id=495
Changed Files: htroot/Table_API_p.java
Sun Dec 07 23:43:38 CET 2014
by Michael Peter Christen
enhancement for clearing the crawl queue 
Changed Files: htroot/Crawler_p.java
Sun Dec 07 04:31:09 CET 2014
by reger
modified FieldReIndex to reindex queries with low number of documents first
by internally using a score map with the number of documents as score
and working through the list from low to high.
Changed Files: htroot/IndexReIndexMonitor_p.html, htroot/IndexReIndexMonitor_p.java, source/net/yacy/search/index/ReindexSolrBusyThread.java
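The low-to-high ordering described in this entry can be sketched with a plain map from query to document count. The class ReindexOrderSketch and its method are hypothetical; the actual ReindexSolrBusyThread uses YaCy's own score map:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: order reindex queries by their number of documents. */
final class ReindexOrderSketch {

    private ReindexOrderSketch() {}

    /** Returns the queries sorted so that the one with the fewest documents comes first. */
    static List<String> lowestCountFirst(final Map<String, Integer> docCountByQuery) {
        final List<Map.Entry<String, Integer>> entries = new ArrayList<>(docCountByQuery.entrySet());
        entries.sort(Map.Entry.comparingByValue()); // the document count acts as the score
        final List<String> ordered = new ArrayList<>();
        for (final Map.Entry<String, Integer> entry : entries) ordered.add(entry.getKey());
        return ordered;
    }
}
```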
Sat Dec 06 22:32:24 CET 2014
by reger
update to commons-logging-1.2
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/commons-logging-1.2.License, lib/commons-logging-1.2.jar, nbproject/project.xml, pom.xml
Sat Dec 06 01:44:03 CET 2014
by reger
Merge origin/master
Changed Files: htroot/api/snapshot.java, source/net/yacy/cora/document/feed/RSSFeed.java, source/net/yacy/cora/document/feed/RSSMessage.java, source/net/yacy/crawler/data/Snapshots.java
Sat Dec 06 01:42:24 CET 2014
by reger
coding fixes suggested in 
http://mantis.tokeek.de/view.php?id=509
http://mantis.tokeek.de/view.php?id=510
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/server/http/HTTPDProxyHandler.java
Sat Dec 06 00:25:05 CET 2014
by Michael Peter Christen
added rss feed output to snapshot servlet which can be used to get a
list of latest/oldest entries in the snapshot database. This is an
example:
http://localhost:8090/api/snapshot.rss?depth=2&order=LATESTFIRST&host=yacy.net&maxcount=100

The properties depth, order, host and maxcount can be omitted. The
meaning of the fields is:
host: select only urls from this host or all, if not given
depth: select only urls at that crawl depth or all, if not given
maxcount: select at most the given number of urls or 10, if not given
order: either LATESTFIRST to select the youngest entries, OLDESTFIRST to
select the first entries or ANY to select any

The rss feed needs administration rights to work, a call to this servlet
with rss extension must attach login credentials.
Changed Files: htroot/api/snapshot.java, source/net/yacy/crawler/data/Snapshots.java
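A client could assemble the feed URL from the optional properties listed in this entry as in the following sketch. The class SnapshotFeedUrlSketch is hypothetical; the path and parameter names are taken from the example URL above, and the login credentials that the feed requires are not handled here:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical client-side sketch: builds a snapshot.rss URL; null means "omit". */
final class SnapshotFeedUrlSketch {

    private SnapshotFeedUrlSketch() {}

    static String build(final String peer, final Integer depth, final String order,
                        final String host, final Integer maxcount) {
        final List<String> params = new ArrayList<>();
        if (depth != null) params.add("depth=" + depth);
        if (order != null) params.add("order=" + encode(order));
        if (host != null) params.add("host=" + encode(host));
        if (maxcount != null) params.add("maxcount=" + maxcount);
        final String path = peer + "/api/snapshot.rss";
        return params.isEmpty() ? path : path + "?" + String.join("&", params);
    }

    private static String encode(final String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (final UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```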
Sat Dec 06 00:18:14 CET 2014
by Michael Peter Christen
added toString() methods to feed classes which makes it possible to
export full rss feed files out of the RSSFeed class
Changed Files: source/net/yacy/cora/document/feed/RSSFeed.java, source/net/yacy/cora/document/feed/RSSMessage.java
Fri Dec 05 03:03:28 CET 2014
by reger
remove the unused Request variable
(fix of  prev. commit)
Changed Files: source/net/yacy/crawler/retrieval/Request.java
Fri Dec 05 01:15:41 CET 2014
by reger
Merge origin/master
Changed Files: htroot/CrawlStartExpert.java, source/net/yacy/cora/util/Html2Image.java, source/net/yacy/kelondro/util/OS.java
Thu Dec 04 01:21:24 CET 2014
by Michael Peter Christen
added Image Events as another option to generate images on a Mac if
Ghostscript is not available or does not work...
Changed Files: source/net/yacy/cora/util/Html2Image.java, source/net/yacy/kelondro/util/OS.java
Wed Dec 03 18:07:05 CET 2014
by Michael Peter Christen
added another path for the convert command because on older Macs
ImageMagick has a different installation location
Changed Files: htroot/CrawlStartExpert.java, source/net/yacy/cora/util/Html2Image.java
Tue Dec 02 21:03:00 CET 2014
by reger
skip creation of unused Bluelist contenttransformer
Changed Files: source/net/yacy/document/parser/html/AbstractTransformer.java, source/net/yacy/document/parser/html/ContentTransformer.java, source/net/yacy/document/parser/html/TransformerWriter.java, source/net/yacy/server/http/HTTPDProxyHandler.java
Tue Dec 02 16:21:06 CET 2014
by Michael Peter Christen
showing list of all threads in threaddump using the ThreadMXBean counter
(this obviously shows more threads than before?)
Changed Files: htroot/Threaddump_p.java
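Counting and listing threads through the JDK's ThreadMXBean can be sketched as below. The class ThreadCountSketch is hypothetical and does not reproduce the Threaddump_p servlet; only the java.lang.management calls are standard JDK API:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical sketch: thread count and a short thread list from ThreadMXBean. */
final class ThreadCountSketch {

    private ThreadCountSketch() {}

    /** Number of live threads (daemon and non-daemon) as reported by the JVM. */
    static int liveThreadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    /** One line per live thread: name and state. */
    static String dump() {
        final ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        final StringBuilder sb = new StringBuilder();
        // false, false: skip lock details, which keeps the call cheap
        for (final ThreadInfo info : bean.dumpAllThreads(false, false)) {
            if (info == null) continue; // thread ended while the dump was taken
            sb.append(info.getThreadName()).append(" [").append(info.getThreadState()).append("]\n");
        }
        return sb.toString();
    }
}
```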
Tue Dec 02 16:05:00 CET 2014
by Michael Peter Christen
set Busy- and Blocking-Threads to daemon mode (they will now not prevent
YaCy from termination if still running)
Changed Files: source/net/yacy/kelondro/workflow/AbstractBlockingThread.java, source/net/yacy/kelondro/workflow/AbstractBusyThread.java, source/net/yacy/kelondro/workflow/AbstractThread.java, source/net/yacy/kelondro/workflow/InstantBlockingThread.java
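The effect of daemon mode can be shown with a minimal sketch; DaemonThreadSketch is a hypothetical name and not one of the workflow classes listed above. A JVM exits once only daemon threads remain, so a thread started this way cannot keep the process alive:

```java
/** Hypothetical sketch: start a worker that does not block JVM termination. */
final class DaemonThreadSketch {

    private DaemonThreadSketch() {}

    static Thread startDaemon(final Runnable task, final String name) {
        final Thread thread = new Thread(task, name);
        thread.setDaemon(true); // must be set before start(), otherwise an exception is thrown
        thread.start();
        return thread;
    }
}
```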
Tue Dec 02 16:04:11 CET 2014
by Michael Peter Christen
show number of threads on status page
Changed Files: htroot/Status.java, htroot/Status_p.inc
Tue Dec 02 13:35:19 CET 2014
by Michael Peter Christen
in case that loading from the cache fails, load from wkhtmltopdf without
cache using the user agent string given in the crawl profile
Changed Files: source/net/yacy/cora/util/Html2Image.java, source/net/yacy/crawler/data/Snapshots.java, source/net/yacy/repository/LoaderDispatcher.java
Tue Dec 02 12:52:36 CET 2014
by Michael Peter Christen
recognize more html file types for snapshots
Changed Files: source/net/yacy/repository/LoaderDispatcher.java
Tue Dec 02 12:52:05 CET 2014
by Michael Peter Christen
get cloned crawl start parameter for snapshots
Changed Files: htroot/CrawlStartExpert.html, htroot/CrawlStartExpert.java
Tue Dec 02 12:10:44 CET 2014
by Michael Peter Christen
recognize more html file extensions
Changed Files: source/net/yacy/document/parser/htmlParser.java
Tue Dec 02 11:51:12 CET 2014
by Michael Peter Christen
fix to xvfb-run usage (quotes did not parse in xvfb-run, default values
are appropriate)
Changed Files: source/net/yacy/cora/util/Html2Image.java
Mon Dec 01 18:21:52 CET 2014
by Michael Peter Christen
added fail-over for a missing http proxy service (i.e. overload) and
quiet mode
Changed Files: source/net/yacy/cora/util/Html2Image.java
Mon Dec 01 17:37:25 CET 2014
by Michael Peter Christen
moved snapshot generation out of the html handler because existing
cache entries could prevent the handler from being executed
Changed Files: source/net/yacy/cora/util/Html2Image.java, source/net/yacy/crawler/retrieval/HTTPLoader.java, source/net/yacy/repository/LoaderDispatcher.java
Mon Dec 01 16:50:37 CET 2014
by Michael Peter Christen
more logging
Changed Files: source/net/yacy/cora/util/Html2Image.java
Mon Dec 01 16:38:07 CET 2014
by Michael Peter Christen
grr
Changed Files: source/net/yacy/cora/util/Html2Image.java
Mon Dec 01 16:26:28 CET 2014
by Michael Peter Christen
wrap wkhtmltopdf with xvfb if necessary
Changed Files: source/net/yacy/cora/util/Html2Image.java
Mon Dec 01 16:00:45 CET 2014
by Michael Peter Christen
more logging when failing to create pdf snapshot
Changed Files: source/net/yacy/cora/util/Html2Image.java
Mon Dec 01 15:20:10 CET 2014
by Michael Peter Christen
added the property timeoutrequests to configuration to disable
TimeoutRequests. The purpose is to test if YaCy runs better on VMs where
there is a limitation of concurrent processes;  see
/proc/user_beancounters in row numproc; this value is limited and should
be low. Try to set timeoutrequests to keep this low. (works only after
restart)
Changed Files: defaults/yacy.init, source/net/yacy/cora/protocol/TimeoutRequest.java, source/net/yacy/search/Switchboard.java
Mon Dec 01 01:12:51 CET 2014
by Michael Peter Christen
moved network configuration to Use Case submenu; this is necessary
because the definition of portal peers within the YaCy freeworld network
is otherwise split into two different main menus.
Changed Files: htroot/ConfigNetwork_p.html, htroot/env/templates/submenuConfig.template, htroot/env/templates/submenuUseCaseAccount.template
Mon Dec 01 00:21:30 CET 2014
by reger
replace deprecated Solr DateField.formatExternal with recommended TrieDateField.formatExternal
Changed Files: source/net/yacy/cora/federate/solr/responsewriter/EnhancedXMLResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/HTMLResponseWriter.java
Sun Nov 30 19:43:53 CET 2014
by reger
update to guava.18.0.jar and jsch.0.1.51.jar
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/guava-18.0.jar, lib/jsch-0.1.51.License, lib/jsch-0.1.51.jar, nbproject/project.xml, pom.xml
Sun Nov 30 19:42:33 CET 2014
by reger
skip unused call parameter for hashSentence()
Changed Files: source/net/yacy/document/SnippetExtractor.java, source/net/yacy/document/WordTokenizer.java, source/net/yacy/search/snippet/MediaSnippet.java, source/net/yacy/search/snippet/TextSnippet.java
Sun Nov 30 01:58:14 CET 2014
by reger
position api icon (ViewFile.html)
Changed Files: htroot/ViewFile.html
Sat Nov 29 22:36:02 CET 2014
by reger
update to poi-3.10.1.jar
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/poi-3.10.1.License, lib/poi-3.10.1.jar, lib/poi-scratchpad-3.10.1.License, lib/poi-scratchpad-3.10.1.jar, nbproject/project.xml, pom.xml
Sat Nov 29 22:13:24 CET 2014
by reger
including small junit test case for WordTokenizer
Changed Files: test/net/yacy/document/WordTokenizerTest.java
Sat Nov 29 17:16:05 CET 2014
by reger
skip tokenizing punctuation as a word in WordTokenizer
remove unused variables in condenser related to Tokenizer
Changed Files: source/net/yacy/document/Condenser.java, source/net/yacy/document/WordTokenizer.java
Sat Nov 29 15:27:16 CET 2014
by reger
add. use host port parameter in YaCyApp
Changed Files: source/net/yacy/gui/InfoPage.java, source/net/yacy/gui/YaCyApp.java, source/net/yacy/gui/framework/Switchboard.java
Sat Nov 29 03:09:55 CET 2014
by reger
adjust translation text of  error msg on empty query
(ru: needs correction)
Changed Files: locales/de.lng, locales/ru.lng
Fri Nov 28 01:40:46 CET 2014
by reger
remove obsolete alternate link
fix api link
Changed Files: htroot/Table_API_p.html
Fri Nov 28 01:25:52 CET 2014
by Michael Peter Christen
added image screenshot generator
Changed Files: source/net/yacy/cora/util/Html2Image.java
Thu Nov 27 20:50:55 CET 2014
by Michael Peter Christen
ignore url errors during search
Changed Files: source/net/yacy/search/query/SearchEvent.java
Thu Nov 27 12:13:20 CET 2014
by Michael Peter Christen
disabled postprocessing by default. If you read this: please disable
postprocessing in your peer as well: open /IndexSchema_p.html, then
deselect field process_sxt
Changed Files: defaults/solr.collection.schema
Thu Nov 27 12:11:54 CET 2014
by Michael Peter Christen
larger boost fields for ranking
Changed Files: htroot/RankingSolr_p.html
Thu Nov 27 08:08:05 CET 2014
by Michael Peter Christen
bold words in snippets should not be coloured black in the base style
because there are styles with dark backgrounds which make the bold word
invisible
Changed Files: htroot/env/base.css
Thu Nov 27 07:44:41 CET 2014
by Michael Peter Christen
changed vocabulary navigator object type to TreeMap to get a specific
order into the vocabularies. The order is now lexicographic, which is
less random than a hashed order
Changed Files: source/net/yacy/search/query/SearchEvent.java
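The ordering guarantee that motivates the switch to TreeMap can be sketched as follows. NavigatorOrderSketch is a hypothetical name, and the map values are placeholders rather than YaCy's navigator objects:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: a TreeMap iterates its keys in natural (lexicographic) order. */
final class NavigatorOrderSketch {

    private NavigatorOrderSketch() {}

    static List<String> lexicographicNames(final Collection<String> vocabularyNames) {
        final Map<String, Integer> navigators = new TreeMap<>();
        for (final String name : vocabularyNames) navigators.put(name, 0); // 0 is a placeholder value
        return new ArrayList<>(navigators.keySet());
    }
}
```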
Wed Nov 26 18:01:35 CET 2014
by Michael Peter Christen
added option to change the navbar-default, i.e. usable for dark skins
Changed Files: defaults/yacy.init, htroot/env/templates/simpleheader.template, source/net/yacy/http/servlets/YaCyDefaultServlet.java
Tue Nov 25 23:11:42 CET 2014
by Michael Peter Christen
trying facet.method fc instead of fcs to handle large facets
Changed Files: source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java
Mon Nov 24 20:53:19 CET 2014
by Michael Peter Christen
prevent a local Solr search and a local RWI search from running
concurrently. When a RWI search result is flushed into the result set,
it does Solr Queries (which replaced the old-style Metadata Queries) and
they are possibly running concurrently to a previously started Solr
search. Both methods may block each other with IO. To enhance the speed,
they are now serialized. Because the Solr search results may result in
better results using the more advanced and configurable Ranking methods,
this result is preferred over the RWI search result. However, remote RWI
search results are still fed concurrently into the search result as
well.
Changed Files: source/net/yacy/search/query/SearchEvent.java
Sun Nov 23 23:29:20 CET 2014
by reger
fix path lookup to ./defaults/yacy.badwords
(fix of commit https://gitorious.org/yacy/rc1/commit/ee277b9b3e033e8261a97b8334b79059220f113a)
Changed Files: source/net/yacy/search/Switchboard.java
Sun Nov 23 23:12:01 CET 2014
by reger
fix empty text facet entry
(noticed on Author facet)
Changed Files: source/net/yacy/peers/Protocol.java
Sun Nov 23 20:11:23 CET 2014
by Michael Peter Christen
more stacks shall be considered for on-demand loading, not only
deep-depth stacks to prevent "too many open files" problem
Changed Files: source/net/yacy/crawler/HostQueue.java
Sun Nov 23 20:09:32 CET 2014
by Michael Peter Christen
reduce number of calls to queue.size() because that may be a bottleneck
during crawling
Changed Files: htroot/yacy/urls.java, source/net/yacy/crawler/HostBalancer.java, source/net/yacy/crawler/data/CrawlQueues.java, source/net/yacy/crawler/data/NoticedURL.java
Sun Nov 23 20:07:32 CET 2014
by Michael Peter Christen
optimize usage of size() cache
Changed Files: source/net/yacy/kelondro/index/OnDemandOpenFileIndex.java
Sun Nov 23 05:22:23 CET 2014
by reger
allow for local yacy.stopwords and yacy.badwords list (in DATA/SETTINGS/)
if the file exists in DATA/SETTINGS it is loaded, otherwise the file in ./defaults is loaded
   (if locale ./defaults/stopwords.xx doesn't exist take solr/lang/stopwords_xx.txt as default)

move yacy.stopwords, yacy.stopwords.de and yacy.badwords.example out of root directory to ./defaults directory

Changed Files: defaults/yacy.badwords.example, defaults/yacy.stopwords, defaults/yacy.stopwords.de, source/net/yacy/search/Switchboard.java
Sat Nov 22 22:49:23 CET 2014
by reger
remove redundant toLower for topwords
Changed Files: source/net/yacy/search/query/SearchEvent.java
Sat Nov 22 12:09:07 CET 2014
by Michael Peter Christen
better delete all files in path when removing host crawl stack
Changed Files: source/net/yacy/crawler/HostBalancer.java
Sat Nov 22 12:04:04 CET 2014
by Michael Peter Christen
if we have many hosts, use on-demand earlier
Changed Files: source/net/yacy/crawler/HostQueue.java
Sat Nov 22 12:01:00 CET 2014
by Michael Peter Christen
prevent division by zero
Changed Files: source/net/yacy/visualization/ChartPlotter.java
Fri Nov 21 14:38:54 CET 2014
by Michael Peter Christen
disabled crazy sleep loop
Changed Files: source/net/yacy/kelondro/util/FileUtils.java
Fri Nov 21 12:42:29 CET 2014
by Michael Peter Christen
when importing vocabulary csv files, accept also files without semicolon
and truncate quotes from literals
Changed Files: htroot/Vocabulary_p.java
Thu Nov 20 18:46:06 CET 2014
by Michael Peter Christen
added hints to ranking to make ranking boosts using vocabularies easier
Changed Files: htroot/RankingSolr_p.html, htroot/RankingSolr_p.java
Thu Nov 20 18:45:27 CET 2014
by Michael Peter Christen
do not cache search requests to Solr if the result is used for
doublechecking. If a double-check comes from cached results the
doublecheck fails.
Changed Files: htroot/IndexDeletion_p.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/schema/CollectionConfiguration.java
Thu Nov 20 18:44:29 CET 2014
by Michael Peter Christen
use a LinkedHashMap for facets to maintain facet order as given by solr
Changed Files: htroot/HostBrowser.java, source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/CachedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/ConcurrentUpdateSolrConnector.java, source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/MirrorSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrConnector.java
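Why a LinkedHashMap keeps the facet order can be shown with a short sketch. FacetOrderSketch is a hypothetical name, and the two arrays stand in for the facet values and counts of a Solr response:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: a LinkedHashMap iterates in insertion order. */
final class FacetOrderSketch {

    private FacetOrderSketch() {}

    /** Copies facet names and counts into a map that preserves the given order. */
    static Map<String, Integer> inGivenOrder(final String[] names, final int[] counts) {
        final Map<String, Integer> facet = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) facet.put(names[i], counts[i]);
        return facet;
    }
}
```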
Thu Nov 20 02:04:43 CET 2014
by reger
include domtype to searcheventcache id
to differentiate between local / global events for reuse of cached events
fix for http://mantis.tokeek.de/view.php?id=493
Changed Files: source/net/yacy/search/query/QueryParams.java
Wed Nov 19 18:12:43 CET 2014
by Michael Peter Christen
added option to enrich vocabularies with synonyms from synonym database
Changed Files: htroot/Vocabulary_p.html, htroot/Vocabulary_p.java, source/net/yacy/cora/language/synonyms/SynonymLibrary.java
Tue Nov 18 15:02:34 CET 2014
by Michael Peter Christen
added new solr schema fields which record the occurrences of vocabulary
matchings. These matches can be used for result boosting, i.e. if a
document contains words from a specific vocabulary, boost it.
Changed Files: defaults/solr.collection.schema, source/net/yacy/migration.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java
Mon Nov 17 14:23:21 CET 2014
by Michael Peter Christen
fix for 2-day network stats table: showing 48 instead of 24 hours from
peer history
Changed Files: htroot/Network.html
Mon Nov 17 14:22:40 CET 2014
by Michael Peter Christen
added option in vocabulary editor to import CSV files with different
encodings (preselected windows-type character encoding which is typical
for CSV files). Fixed also other problems with character encoding in
dictionary files. Automatically generated vocabularies are now also
noted in the API steering.
Changed Files: htroot/Vocabulary_p.html, htroot/Vocabulary_p.java, source/net/yacy/cora/lod/vocabulary/Tagging.java
Mon Nov 17 01:24:30 CET 2014
by reger
adjust tag cloud font size calculation 
to limit max font size to ~ TOPWORDS_MAXSIZE 
Changed Files: htroot/yacysearchtrailer.java
Sun Nov 16 01:26:07 CET 2014
by reger
add a check of java version string >=1.7 to startup class
stopping start with error msg on version < 1.7
Changed Files: source/net/yacy/yacy.java
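A version check of this kind can be sketched as below. JavaVersionCheckSketch is a hypothetical name and not the code in yacy.java; accepting version strings without the leading "1." (such as "11.0.2") is an assumption added for illustration:

```java
/** Hypothetical sketch: decide from a java.version string whether it is 1.7 or newer. */
final class JavaVersionCheckSketch {

    private JavaVersionCheckSketch() {}

    static boolean isJava7OrNewer(final String version) {
        final String[] parts = version.split("[._-]");
        try {
            final int first = Integer.parseInt(parts[0]);
            if (first > 1) return true;          // assumption: "9", "11.0.2" style strings
            if (first < 1 || parts.length < 2) return false;
            return Integer.parseInt(parts[1]) >= 7; // "1.7.0_71" style strings
        } catch (final NumberFormatException e) {
            return false; // unparseable version strings are rejected
        }
    }
}
```

A startup class could call isJava7OrNewer(System.getProperty("java.version")) and stop with an error message when it returns false.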
Fri Nov 14 16:34:55 CET 2014
by Michael Peter Christen
added fix to postprocessing: avoid caching of postprocessing collection
to always get fresh lists of documents. This is necessary since the
postprocessing changes the same documents which the
postprocessing-collection query selects.
Changed Files: htroot/Vocabulary_p.html, htroot/api/status_p.java, source/net/yacy/search/schema/CollectionConfiguration.java
Fri Nov 14 10:02:50 CET 2014
by Michael Peter Christen
added high-precision scheduler for API processes. This allows also to
make the execution in dependency of available RAM or CPU load. The
default value for CPU load is 4.0 and the check runs once a minute.
Changed Files: defaults/yacy.init, htroot/Table_API_p.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/SwitchboardConstants.java
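Making execution depend on CPU load can be sketched with the JDK's system load average. LoadGateSketch is a hypothetical name, and reading the 4.0 default as an upper bound on the load is an assumption; the actual scheduler logic lives in Switchboard:

```java
import java.lang.management.ManagementFactory;

/** Hypothetical sketch: allow a scheduled process only while the system load is low enough. */
final class LoadGateSketch {

    private LoadGateSketch() {}

    /** A negative load means "not available" on this platform; then the gate stays open. */
    static boolean mayRun(final double systemLoad, final double maxLoad) {
        return systemLoad < 0 || systemLoad <= maxLoad;
    }

    /** Checks the current one-minute load average against the given limit. */
    static boolean mayRunNow(final double maxLoad) {
        final double load = ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
        return mayRun(load, maxLoad);
    }
}
```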
Thu Nov 13 01:30:12 CET 2014
by Michael Peter Christen
added missing class for latest changes
Changed Files: source/net/yacy/kelondro/table/IndexTable.java
Thu Nov 13 01:15:31 CET 2014
by Michael Peter Christen
fix in key enumeration methods for cases where the enumeration is done
in reverse order.
Changed Files: source/net/yacy/kelondro/index/BufferedObjectIndex.java, source/net/yacy/kelondro/index/RAMIndex.java
Wed Nov 12 21:32:34 CET 2014
by sixcooler
added an input field for setting 'fileHost'
Set this to avoid error-messages like 'proxy use not allowed / granted'
on accessing your Peer by its hostname.
Changed Files: htroot/SettingsAck_p.java, htroot/Settings_ServerAccess.inc, htroot/Settings_p.java
Tue Nov 11 13:57:04 CET 2014
by Michael Peter Christen
another fix to ordering of table indexes; fixes also network stats
graphics
Changed Files: source/net/yacy/kelondro/index/RAMIndex.java, source/net/yacy/kelondro/index/RAMIndexCluster.java, source/net/yacy/kelondro/table/SplitTable.java, source/net/yacy/kelondro/util/StackIterator.java
Sun Nov 09 22:06:00 CET 2014
by reger
remove not used accordion javascript call for facet navs
Changed Files: htroot/yacysearchtrailer.html, htroot/yacysearchtrailer.java
Sun Nov 09 04:17:14 CET 2014
by reger
skip creation of local var in proxyhandler.storetocache
Changed Files: source/net/yacy/http/ProxyHandler.java
Sat Nov 08 21:10:10 CET 2014
by reger
upd NB project.xml to codec-1.9
Changed Files: nbproject/project.xml
Fri Nov 07 22:43:50 CET 2014
by sixcooler
fix assertion failure in version-string for Solr-4.10.2 by changing
the assert - hope that is ok
+ add forgotten NB project changes
Changed Files: nbproject/project.xml, source/net/yacy/search/index/Fulltext.java
Sun Nov 02 21:16:51 CET 2014
by orbiter
fix for search in case where local peer has no local seed address in
portal mode
Changed Files: source/net/yacy/peers/RemoteSearch.java
Sun Nov 02 20:30:49 CET 2014
by orbiter
added reverse button to tables, by default on now (to see latest entries
first)
Changed Files: htroot/Tables_p.html, htroot/Tables_p.java
Sun Nov 02 20:10:32 CET 2014
by orbiter
added (missing) Tables_p.xml for table xml api
Changed Files: htroot/Tables_p.xml
Sun Nov 02 20:08:49 CET 2014
by orbiter
removed unused options from BusyThreads
Changed Files: source/net/yacy/kelondro/workflow/AbstractBusyThread.java, source/net/yacy/kelondro/workflow/InstantBusyThread.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/ReindexSolrBusyThread.java
Sun Nov 02 12:52:23 CET 2014
by Michael Peter Christen
more enhancements to postprocessing speed
Changed Files: source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/ConcurrentUpdateSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrConnector.java, source/net/yacy/search/schema/CollectionConfiguration.java
Fri Oct 31 17:44:45 CET 2014
by Michael Peter Christen
another fix for postprocessing (the query for "" on numeric field did
not work in external solr)
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java
Fri Oct 31 17:30:24 CET 2014
by Michael Peter Christen
more fixes in postprocessing: partitioning of the complete queue to
enable smaller queries
Changed Files: source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/search/schema/CollectionConfiguration.java
Thu Oct 30 21:52:52 CET 2014
by orbiter
more concurrency for postprocessing
Changed Files: source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java
Thu Oct 30 18:05:48 CET 2014
by orbiter
enhanced postprocessing by usage of a field-list generation to prevent
lazy initialization of the documents. This is useful because the
documents must be read completely anyway.
Changed Files: source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/search/schema/CollectionConfiguration.java
Wed Oct 29 21:41:41 CET 2014
by Michael Peter Christen
better scaling of network statistic graphs
Changed Files: htroot/NetworkHistory.java
Wed Oct 29 17:23:58 CET 2014
by orbiter
replaced old /api/table_p.xml servlet with /Tables_p.xml to avoid double
code
Changed Files: htroot/Tables_p.html, htroot/Tables_p.java
Wed Oct 29 13:37:44 CET 2014
by Michael Peter Christen
added new index size history image in /Status.html page
Changed Files: htroot/NetworkHistory.java, htroot/Status.html, source/net/yacy/visualization/ChartPlotter.java
Wed Oct 29 13:21:35 CET 2014
by Michael Peter Christen
added network history in /Network.html?page=5
Changed Files: htroot/Network.html, htroot/Network.java, htroot/NetworkHistory.java, htroot/Tables_p.java
Wed Oct 29 10:50:08 CET 2014
by Michael Peter Christen
added debug code for statistics about document attributes related to
domains
Changed Files: defaults/yacy.init, htroot/HostBrowser.html, htroot/HostBrowser.java
Sun Oct 26 23:33:21 CET 2014
by reger
RankingSolr: display only available or configured boost fields
Changed Files: htroot/RankingSolr_p.html, htroot/RankingSolr_p.java
Fri Oct 24 12:57:37 CEST 2014
by Michael Peter Christen
replaced input text field with text field for index deletion with query
and replaced GET with POST method. This should make it possible to
submit very large queries for deletion here.
Changed Files: .gitignore, htroot/IndexDeletion_p.html
Fri Oct 24 12:32:44 CEST 2014
by sixcooler
bump to httpcore-4.3.3
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/dependencies.txt, lib/httpcore-4.3.3.License, lib/httpcore-4.3.3.jar, nbproject/project.xml, pom.xml
Mon Oct 20 18:05:37 CEST 2014
by orbiter
removed spaces in seedlist.xml to reduce data
Changed Files: htroot/yacy/seedlist.xml
Fri Oct 17 14:17:49 CEST 2014
by Michael Peter Christen
added field postprocessing.partialUpdate to settings which can be used
to switch on or off partial updates. Both options should cause the same
result. Default is on.
Changed Files: defaults/yacy.init, source/net/yacy/search/Switchboard.java, source/net/yacy/search/schema/CollectionConfiguration.java
Fri Oct 17 13:25:17 CEST 2014
by Michael Peter Christen
fix for an ssl bug that appears only in java 7.
The bug was reported in
http://forum.yacy-websuche.de/viewtopic.php?f=23&t=5407&p=30956#p30956
a solution was described in
http://teknosrc.com/javax-net-ssl-sslprotocolexception-handshake-alert-unrecognized_name-solved/
which worked for this example given in the yacy forum
Changed Files: source/net/yacy/yacy.java
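The handshake alert named in the linked article's URL (unrecognized_name) is commonly worked around in Java 7 by switching off the SNI extension through a JSSE system property. A minimal sketch, assuming this is the kind of fix applied here; SniWorkaroundSketch is a hypothetical name:

```java
/** Hypothetical sketch of the workaround; whether YaCy sets exactly this is an assumption. */
final class SniWorkaroundSketch {

    private SniWorkaroundSketch() {}

    static void disableSniExtension() {
        // should run at startup, before the JVM opens its first SSL/TLS connection
        System.setProperty("jsse.enableSNIExtension", "false");
    }
}
```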
Fri Oct 17 12:45:26 CEST 2014
by Michael Peter Christen
concurrently initialize the error cache; extended also the cache by
factor 10 up to 1000 entries. This error cache is only used to catch up
paused crawls between shutdown+startup
Changed Files: source/net/yacy/search/index/ErrorCache.java
Fri Oct 17 12:44:28 CEST 2014
by Michael Peter Christen
ipv6 fix for api /yacy/seedlist.[json|xml], multiple IPs are now
attached to the seed info. API clients must be adapted. Documentation
will be fixed in
http://www.yacy-websuche.de/wiki/index.php/Dev:APIseedlist

Also added a new retrieval option for seeds, they can now be retrieved
by their name with the get parameter name=<name>
Changed Files: htroot/yacy/seedlist.java, htroot/yacy/seedlist.json
Thu Oct 16 20:36:12 CEST 2014
by sixcooler
added a timeout on Jetty connectors
Changed Files: source/net/yacy/http/Jetty9HttpServerImpl.java
Wed Oct 15 18:13:54 CEST 2014
by sixcooler
do not overwrite yacy.conf in case of an exception
may be a fix for http://mantis.tokeek.de/view.php?id=180
Changed Files: source/net/yacy/kelondro/util/FileUtils.java
Wed Oct 15 11:19:25 CEST 2014
by Michael Peter Christen
removed warnings
Changed Files: htroot/Table_API_p.java, htroot/Table_YMark_p.java, htroot/Tables_p.java, htroot/api/table_p.java, source/net/yacy/document/parser/augment/AugmentParser.java, source/net/yacy/search/Switchboard.java
Wed Oct 15 11:07:08 CEST 2014
by Michael Peter Christen
automatically zoom to location/POI
Changed Files: htroot/yacysearch_location.html
Wed Oct 15 10:31:24 CEST 2014
by orbiter
enhanced graphics computation (avoiding long string parsing for colours)
Changed Files: htroot/NetworkHistory.java, htroot/NetworkPicture.java, htroot/imagetest.java, source/net/yacy/dbtest.java, source/net/yacy/peers/graphics/NetworkGraph.java, source/net/yacy/peers/graphics/ProfilingGraph.java, source/net/yacy/visualization/ChartPlotter.java
Wed Oct 15 09:13:23 CEST 2014
by orbiter
added proper copyright notice to OSM tiles presented at the search
result page
Changed Files: htroot/osm.java, source/net/yacy/peers/graphics/OSMTile.java
Wed Oct 15 00:55:57 CEST 2014
by Michael Peter Christen
enhanced location search
Changed Files: htroot/env/grafics/earthsearch.png, htroot/yacysearch_location.html, htroot/yacysearch_location.java, htroot/yacysearchtrailer.html
Wed Oct 15 00:55:42 CEST 2014
by Michael Peter Christen
better profiling of solr queries
Changed Files: source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java
Mon Oct 13 23:51:19 CEST 2014
by Michael Peter Christen
added new solr field url_paths_count_i which can be used to enhance the
index browser and maybe also for ranking; possibly also for
SEO-with-YaCy applications.
Changed Files: defaults/solr.collection.schema, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java
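
The commit message does not define how url_paths_count_i is computed. The sketch below assumes it counts the directory elements of the URL path, excluding the file name; that definition and all names in the code are assumptions:

```java
import java.net.URI;

// Hypothetical helper; the counting rule is an assumption, see above.
final class UrlPathsCountSketch {

    // Counts the non-empty path elements in front of the last '/'.
    static int pathElementCount(String url) {
        String path = URI.create(url).getPath();
        if (path == null) return 0;              // opaque URI, no path
        int lastSlash = path.lastIndexOf('/');
        if (lastSlash <= 0) return 0;            // no directory part
        int count = 0;
        for (String element : path.substring(0, lastSlash).split("/")) {
            if (!element.isEmpty()) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(pathElementCount("http://example.org/a/b/c.html"));
    }
}
```

Under this assumption a document directly below the host root has a count of 0, which an index browser could use to group or sort entries by depth.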
Mon Oct 13 18:33:39 CEST 2014
by Michael Peter Christen
make browsing of file://z: - paths in index browser easier - this will
now show the root paths on a shared drive
Changed Files: htroot/HostBrowser.java
Mon Oct 13 16:51:27 CEST 2014
by Michael Peter Christen
fix-fix for
https://gitorious.org/yacy/rc1/commit/30d4402cd1bbd5629d23562178a049ef7c3b25e9
Changed Files: source/net/yacy/cora/document/feed/RSSMessage.java
Sun Oct 12 06:32:13 CEST 2014
by reger
add filter to citation page and an on/off button
to display only sentences with citations,
while maintaining the sentence number.
Make the filtered list the default in search result citation link
Changed Files: htroot/api/citation.html, htroot/api/citation.java, htroot/yacysearchitem.html
Sat Oct 11 09:02:12 CEST 2014
by Michael Peter Christen
explain crawl denial when not switched to intranet mode
Changed Files: source/net/yacy/crawler/CrawlStacker.java
Fri Oct 10 14:40:31 CEST 2014
by Michael Peter Christen
slightly enhanced Network table computation by using a lazily
initialized bitfield for peer flags
Changed Files: source/net/yacy/peers/Seed.java
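
The idea behind a lazily initialized bitfield is to decode the flags from their stored form only when a flag is first read, and to reuse the decoded object afterwards. A minimal sketch using java.util.BitSet; the 0/1 string encoding and all names are assumptions, and YaCy uses its own Bitfield class:

```java
import java.util.BitSet;

// Hypothetical stand-in for a seed that carries encoded peer flags.
final class LazyFlagsSketch {

    private final String encodedFlags; // assumed raw form, e.g. "1010"
    private BitSet flags;              // stays null until first access
    private int decodeCount = 0;       // only here to make the effect visible

    LazyFlagsSketch(String encodedFlags) {
        this.encodedFlags = encodedFlags;
    }

    // Decodes on first call and returns the cached BitSet afterwards.
    // Not thread-safe; a shared object would need synchronization.
    private BitSet flags() {
        if (flags == null) {
            BitSet decoded = new BitSet(encodedFlags.length());
            for (int i = 0; i < encodedFlags.length(); i++) {
                if (encodedFlags.charAt(i) == '1') decoded.set(i);
            }
            flags = decoded;
            decodeCount++;
        }
        return flags;
    }

    boolean isSet(int index) {
        return flags().get(index);
    }

    int decodeCount() {
        return decodeCount;
    }
}
```

When a table with many peers is rendered and each row reads several flags, the decoding cost is paid once per seed instead of once per flag lookup.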
Fri Oct 10 14:32:21 CEST 2014
by Michael Peter Christen
refactoring (class name should start with uppercase letter)
Changed Files: htroot/NetworkHistory.java, source/net/yacy/peers/Seed.java, source/net/yacy/utils/Bitfield.java
Fri Oct 10 14:16:16 CEST 2014
by Michael Peter Christen
added also the NetworkHistory servlet...
Changed Files: htroot/NetworkHistory.java
Fri Oct 10 14:06:47 CEST 2014
by Michael Peter Christen
added network history graph image /NetworkHistory.png which can show
many different statistics about the history of the peer.
Changed Files: source/net/yacy/dbtest.java, source/net/yacy/kelondro/blob/BEncodedHeap.java, source/net/yacy/kelondro/blob/Tables.java, source/net/yacy/peers/graphics/ProfilingGraph.java, source/net/yacy/visualization/ChartPlotter.java
Thu Oct 09 13:31:36 CEST 2014
by Marc Nause
Minor changes:

*) reduced visibility of a method
*) updated comments
Changed Files: source/net/yacy/utils/upnp/UPnP.java
Thu Oct 09 13:27:20 CEST 2014
by Michael Peter Christen
fix for values in CrawlProfileEditor table and xml; now the full profile
is available in the xml.
Changed Files: htroot/CrawlProfileEditor_p.html, htroot/CrawlProfileEditor_p.xml, source/net/yacy/crawler/data/CrawlProfile.java, source/net/yacy/server/serverObjects.java
Wed Oct 08 18:48:57 CEST 2014
by Michael Peter Christen
fixed crawl profile xml result which did not show the correct crawl
status.
Changed Files: htroot/CrawlProfileEditor_p.xml, source/net/yacy/crawler/data/CrawlProfile.java
Wed Oct 08 17:12:35 CEST 2014
by Michael Peter Christen
added another decoration flag to switch off network graphics in crawler
monitor and index browser: decoration.grafics.linkstructure
Please set this to false to remove the graphics from the interface.
Changed Files: defaults/yacy.init, htroot/Crawler_p.java, htroot/HostBrowser.java, source/net/yacy/search/SwitchboardConstants.java
Wed Oct 08 15:20:43 CEST 2014
by Michael Peter Christen
added a high cpu cycle monitor to PerformanceQueues
Changed Files: htroot/PerformanceQueues_p.html, htroot/PerformanceQueues_p.java, htroot/PerformanceQueues_p.xml, source/net/yacy/kelondro/workflow/AbstractBusyThread.java, source/net/yacy/kelondro/workflow/BusyThread.java
Wed Oct 08 15:04:35 CEST 2014
by Michael Peter Christen
less volume for effect sounds
Changed Files: htroot/yacy/search.java, htroot/yacy/transferURL.java, source/net/yacy/gui/Audio.java, source/net/yacy/search/Switchboard.java
Wed Oct 08 14:27:38 CEST 2014
by Michael Peter Christen
less load and more ram prerequisite for crawl steps
Changed Files: defaults/yacy.init
Tue Oct 07 23:42:41 CEST 2014
by Michael Peter Christen
removed the atmo sound clips because they were too large
Changed Files: htroot/env/soundclips/sources.txt, source/net/yacy/gui/Audio.java
Tue Oct 07 22:36:01 CEST 2014
by Michael Peter Christen
IPv6 fix: prevent the reduced set of own IPs from being overwritten with
the (non-valid) set of local IPs
Changed Files: source/net/yacy/search/Switchboard.java
Tue Oct 07 18:32:39 CEST 2014
by Michael Peter Christen
argh.. adding missing java class for latest audio feature
Changed Files: source/net/yacy/gui/Audio.java
Mon Oct 06 04:51:31 CEST 2014
by reger
add link extraction to pdfParser 
this extracts clickable links in a pdf and adds them to the list of links

include a test case for this function

this is the corrected comment for commit:
https://gitorious.org/yacy/rc1/commit/aa2e15d846cdee90b70ea882747148b14f257c49
Changed Files: source/net/yacy/document/parser/pdfParser.java
Sun Oct 05 20:05:03 CEST 2014
by reger
allow url parameter in worktable apicall
 allow url=wwwl?param=a&param=b (with ?, & encoded)
fix:  http://mantis.tokeek.de/view.php?id=100

fix double adding of  '&' in MultiProtocolURL.escape()
Changed Files: source/net/yacy/document/parser/pdfParser.java, test/net/yacy/document/parser/pdfParserTest.java, test/parsertest/umlaute_linux.pdf
Sun Oct 05 14:50:22 CEST 2014
by orbiter
lazy handling of process_sxt field (part of postprocessing)
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java
Sat Oct 04 04:11:48 CEST 2014
by reger
allow url parameter in worktable apicall
 allow url=wwwl?param=a&param=b (with ?, & encoded)
fix:  http://mantis.tokeek.de/view.php?id=100

fix double adding of  '&' in MultiProtocolURL.escape()
Changed Files: htroot/Load_RSS_p.java, source/net/yacy/cora/document/id/MultiProtocolURL.java, source/net/yacy/data/WorkTables.java
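
For a url= value that carries its own query string to survive as one parameter, its "?" and "&" have to be percent-encoded before it is appended to the API call. A sketch of the caller side with java.net.URLEncoder; the servlet path and the example URL are placeholders, not taken from YaCy:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical caller-side helper, not part of YaCy.
final class ApiCallUrlSketch {

    // Percent-encodes a value so it can be used as one query parameter.
    static String encodeParam(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be available on every JVM
            throw new IllegalStateException(e);
        }
    }

    // Appends targetUrl, including its own "?" and "&", as the url parameter.
    static String apiCall(String servletPath, String targetUrl) {
        return servletPath + "?url=" + encodeParam(targetUrl);
    }

    public static void main(String[] args) {
        System.out.println(apiCall("/SomeServlet.html",
                "http://example.org/feed?param=a&param=b"));
    }
}
```

Without the encoding, the second param= would be read as a parameter of the API call itself rather than as part of the url value.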
Fri Oct 03 22:08:07 CEST 2014
by reger
preserve content_type (mime) if supplied, in preference to constructing it from the file type.
(this can eventually benefit image search by using mime only)

reduce redundant field assignment for SolrDocuments created from URIMetadataNode (URIMetadataNode = SolrDocument with partially assigned fields)

Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java
Fri Oct 03 20:54:45 CEST 2014
by reger
upd to jsoup-1.8.1.jar
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/jsoup-1.8.1.jar, nbproject/project.xml, pom.xml
Fri Oct 03 20:49:40 CEST 2014
by reger
open rejected urls in new browser
Changed Files: htroot/IndexCreateParserErrors_p.html
Fri Oct 03 01:43:05 CEST 2014
by reger
fix image search expand box, cut-off of 2nd capture line height
tested with IE11 and Firefox 32 (change worked for both to show 2nd line without cutting off height)

+fix charset parameter in metadataImageParser
+update start errMsgTxt to "java 1.7"
Changed Files: htroot/js/highslide/highslide.js, source/net/yacy/document/parser/images/metadataImageParser.java, source/net/yacy/yacy.java
Thu Oct 02 09:38:06 CEST 2014
by Michael Peter Christen
typo in javadoc
Changed Files: source/net/yacy/cora/protocol/HeaderFramework.java
Wed Oct 01 23:53:41 CEST 2014
by reger
add html5 autofocus to query input field
(leave onload untouched = redundant, for IE9 http://www.w3schools.com/tags/att_input_autofocus.asp)

adjust Peer-to-Peer/ Privacy switch label 
to display "Peer-to-Peer" as 2nd switch option in active stealth mode
Changed Files: htroot/index.html, htroot/yacysearch.html, htroot/yacysearch_location.html, htroot/yacysearchtrailer.html
Wed Oct 01 04:35:34 CEST 2014
by reger
handle  noarchive tag, skip writing page to cache
http://mantis.tokeek.de/view.php?id=44
Changed Files: source/net/yacy/crawler/data/Cache.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java
Wed Oct 01 03:47:57 CEST 2014
by Michael Peter Christen
when pinging other peers, be able to select the right IP option
Changed Files: htroot/Network.java, source/net/yacy/peers/Network.java, source/net/yacy/peers/Protocol.java
Tue Sep 30 22:22:13 CEST 2014
by reger
search result showPicture update search parameter
used parameter &cat=image is obsolete and returns no results
- remove &cat=image and &cat=href references
- remove &tenant= references (unused)
Use contentdom=image and the inurl: parameter to make the showPicture link display something (it opens in a new window because the inurl modifier changes the original query)
Changed Files: htroot/index.java, htroot/yacysearch.html, htroot/yacysearch.java, htroot/yacysearchitem.html, htroot/yacysearchtrailer.java
Tue Sep 30 05:04:47 CEST 2014
by reger
added metadataImageParser for tif and psd (Photoshop) images.
This is a modified genericImageParser adding tif (and psd) support even if java ImageIO plugin for tif is not installed in JDK.
Adds just tif and psd to the available parsers.
Uses the same library to extract metadata, so could eventually be merged with genericImageParser.
All detected metadata are added to the parsed document (potentially some more than with genericImageParser)
Changed Files: source/net/yacy/document/TextParser.java, source/net/yacy/document/parser/images/genericImageParser.java, source/net/yacy/document/parser/images/metadataImageParser.java
Mon Sep 29 07:42:51 CEST 2014
by reger
use javax ImageIO getReader to add supported image extension/mime
genericImageParser uses javax ImageIO; the supported images depend on the available plugins for the ImageIO package (this is JDK installation specific). Jpeg, png and gif are available by default. Tif and others are supported only if a plugin is available (in the classpath).
Add supported image type dynamically on startup.
Changed Files: source/net/yacy/document/parser/images/genericImageParser.java
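
javax.imageio.ImageIO can report at runtime which file suffixes and MIME types the installed reader plugins understand. A minimal sketch of such a startup query; whether genericImageParser uses exactly these two calls is an assumption, and the class and method names are illustrative:

```java
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import javax.imageio.ImageIO;

// Hypothetical helper, not the YaCy implementation.
final class SupportedImageTypesSketch {

    // File suffixes for which an ImageIO reader plugin is registered.
    static Set<String> readableSuffixes() {
        Set<String> result = new TreeSet<>();
        for (String suffix : ImageIO.getReaderFileSuffixes()) {
            if (!suffix.isEmpty()) result.add(suffix.toLowerCase(Locale.ROOT));
        }
        return result;
    }

    // MIME types for which an ImageIO reader plugin is registered.
    static Set<String> readableMimeTypes() {
        Set<String> result = new TreeSet<>();
        for (String mime : ImageIO.getReaderMIMETypes()) {
            if (!mime.isEmpty()) result.add(mime.toLowerCase(Locale.ROOT));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(readableSuffixes());
        System.out.println(readableMimeTypes());
    }
}
```

The printed sets differ between installations, which is why the supported types are collected at startup instead of being hard-coded.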
Mon Sep 29 02:24:29 CEST 2014
by reger
remove unused variable timeout
Changed Files: source/net/yacy/search/query/QueryParams.java
Fri Sep 26 23:49:10 CEST 2014
by reger
skip loader wait cycle on concurrent access in nocache configuration.
In a nocache configuration the resource is loaded online, so there is no benefit in waiting for a faster cache hit.
Changed Files: source/net/yacy/repository/LoaderDispatcher.java
Wed Sep 24 13:32:58 CEST 2014
by Michael Peter Christen
activated the new apk parser which was already ready but not included in
the parser initialization. To make the apk parser usable, the handling
of application type links had to be modified. Now all documents which
do not have a parser attached are placed in the noload-queue while all
other documents are parsed using the associated parser class. This may
have side effects on other parsers and the display of different file
classes (images, apps, videos).
Changed Files: source/net/yacy/crawler/CrawlStacker.java, source/net/yacy/crawler/retrieval/Request.java, source/net/yacy/document/TextParser.java, source/net/yacy/document/parser/apkParser.java
Mon Sep 22 15:28:54 CEST 2014
by orbiter
added a hack to forward solr search results from an externally attached
solr to the YaCy built-in solr search servlet. It's not complete and not
fully correct (there is still a utf8 encoding problem) but it is an easy
way to get requests forwarded through YaCy to an external Solr.
Changed Files: source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/responsewriter/EnhancedXMLResponseWriter.java, source/net/yacy/http/servlets/SolrSelectServlet.java, source/net/yacy/search/index/Fulltext.java
Sun Sep 21 22:35:03 CEST 2014
by reger
add link to thread pool settings in status panel
Changed Files: htroot/PerformanceQueues_p.html, htroot/Status_p.inc
Sun Sep 21 03:48:54 CEST 2014
by reger
fix NPE in ViewFile - show snippet
on document not in index
Changed Files: htroot/ViewFile.html, htroot/ViewFile.java
Sun Sep 21 00:10:20 CEST 2014
by reger
adjust link to peer in Network list
(www path obsolete)
Changed Files: htroot/Network.html
Sun Sep 21 00:04:54 CEST 2014
by reger
upd Maven pom (to current dev version)
Changed Files: pom.xml
Thu Sep 18 14:36:57 CEST 2014
by Michael Peter Christen
added warning for not well-formed postprocessing queries
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java
Thu Sep 18 14:26:45 CEST 2014
by Michael Peter Christen
added internal api for partial updates to Solr
Changed Files: source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/ConcurrentUpdateSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrConnector.java
Thu Sep 18 11:11:09 CEST 2014
by orbiter
added option to reverse-sort YaCy tables (internal API change only)
Changed Files: htroot/Table_YMark_p.java, htroot/Tables_p.java, htroot/api/table_p.java, source/net/yacy/crawler/CrawlSwitchboard.java, source/net/yacy/kelondro/blob/Tables.java
Wed Sep 17 00:22:23 CEST 2014
by Michael Peter Christen
next development cycle. Please be careful with the usage of next
commits, maybe new and unstable things will come...
Changed Files: build.properties
Tue Sep 16 23:14:13 CEST 2014
by reger
catch TimeoutException during ping and do not delete yacy.conf during prereadconfigfile
found a situation after crash (reboot) with existing running semaphore but YaCy not running.
Ping generated exception which finally deleted the conf file (during pre-read procedure)
- change to ping (catch exception solved it)
- additionally removed delete yacy.conf file (if needed we need to make a backup)
Changed Files: source/net/yacy/cora/protocol/Scanner.java, source/net/yacy/cora/protocol/TimeoutRequest.java, source/net/yacy/migration.java, source/net/yacy/search/Switchboard.java, source/net/yacy/yacy.java
Tue Sep 16 16:43:17 CEST 2014
by reger
better fix for NPE in image search
replace https://gitorious.org/yacy/rc1/commit/8931e14514deff5ab66e7504c45bb78046e5f696
Changed Files: source/net/yacy/search/query/SearchEvent.java