YaCy Release 1.4

This is the YaCy main release 1.4.

This release mainly brings a deeper Solr integration: many more Solr
fields are filled, Solr now has multi-core capabilities, and a second
core with a webgraph was added (but is deactivated pending further
testing). The opensearch result writer of the integrated Solr now has
all the features of the original opensearch result servlet of YaCy, and
the file search interface "yacyinteractive" now uses this new result
writer instead of the old one. Searching with that interface is now
much faster.

The default search functionality has undergone a full redesign, and a
lot of testing was done to fix problems with the migration to Solr. The
normal search is now very fast, in portal mode and even in p2p mode.
The ranking was strongly enhanced: there is now support for flexible
field boosts, boost functions and boost queries. All these ranking
functions have been made editable, and there is a new configuration
servlet for this. Furthermore, several ranking schemas are predefined:
one for default internet search, one for sort-by-date, and one for
intranet search requests, which is triggered automatically if a
site-operator is used. Intranet search ranking rates deep links higher
than shallow ones, which returns more specific document types. Remote
searches are done using the local ranking profile, not the remote
profile. The selection of target peers has been enhanced: all robinson
peers which have a Solr interface are now searched using that interface
rather than the old YaCy interface.

Indexing speed should also improve, because fewer requests are sent to
Solr for indexing, index updates are bundled together, and forced
commits have been reduced by using the soft-commit feature introduced
with Solr 4.0 (this release updates to Solr 4.1.0). Some memory leaks
have been fixed, and the overall memory usage should be lower.

Finally, the logging has been renovated, and all logging is now done
using only the Java logger. All libraries which use a different logging
technology are routed to the Java logger using logging gateways. The
total number of log entries for Solr updates has been strongly reduced,
since these many log entries had become a performance issue in 1.3.

Major Changes   

Commit / Description
Wed Mar 13 14:47:00 CET 2013
by Michael Peter Christen
changes in ranking computation
- an existing ranking servlet for solr was extended. It is now possible
to set boost values for fields, boost functions and boost queries.
- The ranking can have different instances, but currently only the first
one is used
- added an abstraction layer for fields which can be used for search;
those fields can be edited in the solr ranking configuration
- the ranking value which solr returns in the field 'score' is used to
combine the results of remote search requests, which are all created
using the same locally defined boost values
- reduced the number of fields which are used for search (makes it
faster)
- replaced some text fields by string fields (makes indexing faster)
- removed classes which had no use
- made a large number of experiments for a better ranking and created a
temporary setting which prefers hits inside titles
- also adjusted the RWI-based ranking computation to 'prefer title'
- added special cases, e.g. for portal search, where no post-processing
and post-ranking is wanted: this keeps the original ranking order as
computed by Solr
- fixed many bugs with old settings for ranking
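
As a rough illustration of what a field boost amounts to on the Solr
side, the sketch below joins per-field boost values into the
'field^boost' form accepted by the 'qf' parameter of Solr's
dismax/edismax query parsers. The class 'BoostSketch', the field names
and the boost values are assumptions made up for this example; this is
not the code of the ranking servlet. Boost functions and boost queries
would travel in the separate 'bf' and 'bq' parameters.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of YaCy: turns field boosts into a qf value.
final class BoostSketch {

    // Joins the entries as "field^boost", separated by single spaces,
    // in the iteration order of the given map.
    static String toQueryFields(Map<String, Float> boosts) {
        StringJoiner qf = new StringJoiner(" ");
        for (Map.Entry<String, Float> e : boosts.entrySet()) {
            qf.add(e.getKey() + "^" + e.getValue());
        }
        return qf.toString();
    }

    public static void main(String[] args) {
        Map<String, Float> boosts = new LinkedHashMap<>();
        boosts.put("title", 5.0f);   // assumed value for a 'prefer title' setting
        boosts.put("text_t", 1.0f);  // assumed body text field
        System.out.println("qf=" + toQueryFields(boosts));
    }
}
```

Because every remote request is built from the same local boost values,
the returned scores are at least computed with identical weights, which
is what makes combining them by 'score' plausible.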
Changed Files: defaults/solr.collection.schema, defaults/solr.webgraph.schema, defaults/solr/schema.xml, defaults/yacy.init, htroot/ContentAnalysis_p.java, htroot/RankingSolr_p.html, htroot/RankingSolr_p.java, htroot/env/templates/header.template, htroot/env/templates/submenuIndexControl.template, htroot/gsa/searchresult.java, htroot/solr/select.java, htroot/yacysearchitem.java, source/net/yacy/cora/federate/opensearch/SRURSSConnector.java, source/net/yacy/cora/federate/solr/Ranking.java, source/net/yacy/cora/federate/solr/SchemaDeclaration.java, source/net/yacy/cora/federate/solr/connector/CachedSolrConnector.java, source/net/yacy/cora/federate/solr/instance/InstanceMirror.java, source/net/yacy/document/Condenser.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/SwitchboardConstants.java, source/net/yacy/search/query/QueryGoal.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java, source/net/yacy/search/query/SearchEventCache.java, source/net/yacy/search/ranking/RankingProfile.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java, source/net/yacy/search/schema/WebgraphSchema.java, source/net/yacy/search/snippet/ResultEntry.java
Fri Mar 01 15:27:17 CET 2013
by orbiter
- enhanced solr.add procedure for mass adds
- removed unused solr access classes
- made snippet generation for documents from YaCy RWI/DHT concurrent (as
it was before the search process renovation)
- reduced the number of remote results in the settings file because
processing such mass document additions is too CPU-intensive (in Solr)
Changed Files: defaults/yacy.network.freeworld.unit, defaults/yacy.network.metager.unit, htroot/IndexFederated_p.java, source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/CachedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/MirrorSolrConnector.java, source/net/yacy/cora/federate/solr/instance/RemoteInstance.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/query/SearchEvent.java
Wed Feb 27 22:40:23 CET 2013
by orbiter
corrected result counter
Changed Files: htroot/IndexControlRWIs_p.java, htroot/yacysearch.java, htroot/yacysearchitem.java, htroot/yacysearchlatestinfo.java, source/net/yacy/cora/sorting/WeakPriorityBlockingQueue.java, source/net/yacy/kelondro/data/citation/CitationReferenceFactory.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/kelondro/data/meta/URIMetadataRow.java, source/net/yacy/kelondro/data/word/WordReferenceFactory.java, source/net/yacy/kelondro/data/word/WordReferenceVars.java, source/net/yacy/kelondro/rwi/ReferenceContainer.java, source/net/yacy/kelondro/rwi/ReferenceFactory.java, source/net/yacy/peers/graphics/WebStructureGraph.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/query/SearchEvent.java, source/net/yacy/search/ranking/ReferenceOrder.java
Tue Feb 26 17:16:31 CET 2013
by Michael Peter Christen
complete redesign of search process:
- removed 'worker' processes
- no internal time-out behaviour: methods either are successful or
return null
- waiting is only done on top-level
- removed snippet-production; this is replaced by solr snippets
- removed statistics based on solr size queries (they took VERY
long); the statistics (like suggestions or tag cloud) are now again
based on the old but very fast RWI index. In portal or intranet mode the
RWI index is usually switched off; if you would like to have statistics
again then you must switch the RWI index back on in this mode.
- fixed many bugs regarding correct page counter
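
The 'succeed or return null, wait only on top-level' rule can be
pictured with plain java.util.concurrent classes. This is a minimal
sketch under assumptions: the class 'TopLevelWaitSketch', the method
names and the two fake sources 'local' and 'remote' are invented for
illustration and do not mirror the SearchEvent code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration, not YaCy code.
final class TopLevelWaitSketch {

    // A source lookup either succeeds or returns null; it never waits itself.
    static List<String> lookup(String source, String query) {
        if (query == null || query.isEmpty()) return null;
        List<String> hits = new ArrayList<>();
        hits.add(source + ":" + query);
        return hits;
    }

    // The only place where a time budget is applied.
    static List<String> search(final String query, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<Callable<List<String>>> tasks = new ArrayList<>();
        tasks.add(() -> lookup("local", query));
        tasks.add(() -> lookup("remote", query));
        List<String> merged = new ArrayList<>();
        try {
            // invokeAll cancels every task still unfinished when the timeout expires
            for (Future<List<String>> f : pool.invokeAll(tasks, timeoutMs, TimeUnit.MILLISECONDS)) {
                if (f.isCancelled()) continue;
                try {
                    List<String> part = f.get();
                    if (part != null) merged.addAll(part);
                } catch (ExecutionException e) {
                    // a failed source simply contributes nothing
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return merged;
    }
}
```

Keeping the timeout in one place means the inner methods need no
time-out bookkeeping of their own; a null result is the only failure
signal they have to produce.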
Changed Files: htroot/AccessTracker_p.java, htroot/IndexControlRWIs_p.java, htroot/js/yacysearch.js, htroot/suggest.java, htroot/yacy/search.java, htroot/yacysearch.java, htroot/yacysearchitem.java, htroot/yacysearchlatestinfo.java, htroot/yacysearchtrailer.java, source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/data/DidYouMean.java, source/net/yacy/kelondro/data/meta/DigestURI.java, source/net/yacy/peers/Protocol.java, source/net/yacy/peers/RemoteSearch.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/query/AccessTracker.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java, source/net/yacy/search/query/SearchEventCache.java
Mon Feb 25 00:09:41 CET 2013
by Michael Peter Christen
- generalized SchemaConfiguration into super-class Configuration and
adapted other classes which used the configuration-only access for that
class
- removed many warnings
- adjusted logging
Changed Files: .gitignore, defaults/yacy.logging, defaults/yacy.network.freeworld.unit, htroot/ConfigHeuristics_p.java, htroot/yacy/ui/css/autocomplete.css, source/net/yacy/cora/federate/opensearch/OpenSearchConnector.java, source/net/yacy/cora/federate/solr/SchemaConfiguration.java, source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/instance/EmbeddedInstance.java, source/net/yacy/cora/storage/Configuration.java, source/net/yacy/kelondro/blob/Heap.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/WebgraphConfiguration.java, source/net/yacy/upnp/impls/InternetGatewayDevice.java
Sun Feb 24 18:09:34 CET 2013
by Michael Peter Christen
- added flags in IndexFederated_p.html to switch the webgraph index (new
solr core 'webgraph') on or off; it is now off by default
- completely redesigned this servlet
- added a description of how to attach a remote solr
- adjusted naming of servlet and menus
- moved the 'lazy initialization' attribute from IndexSchema back to
IndexFederated (this is a general option).
Changed Files: defaults/yacy.init, htroot/IndexControlURLs_p.html, htroot/IndexFederated_p.html, htroot/IndexFederated_p.java, htroot/IndexSchema_p.html, htroot/IndexSchema_p.java, htroot/env/base.css, htroot/env/templates/header.template, htroot/env/templates/submenuIndexControl.template, source/net/yacy/search/Switchboard.java, source/net/yacy/search/SwitchboardConstants.java, source/net/yacy/search/index/DocumentIndex.java, source/net/yacy/search/index/Segment.java
Fri Feb 22 15:45:15 CET 2013
by Michael Peter Christen
added the generation of 50 (!!) new solr fields in the core 'webgraph'.
The default schema uses only some of them and the resulting search index
now has the following properties:
- the webgraph will have about 40 times as many entries as the default
index
- the complete index size will increase and may be about double the
current size
As testing showed, not much indexing performance is lost. The default
index will be smaller (fields were moved out of it); thus searching
can be faster.
With the new index some old parts of YaCy can be removed, i.e. the
specialized webgraph data and the noload crawler. The new index will
make it possible to:
- search within link texts of linked but not indexed documents (about 20
times the size of the document index!!)
- get a very detailed link graph
- enhance ranking using a complete link graph

To get full access to the new index, the API to solr now has two
access points: one with attribute core=collection1 for the default
search index and one with core=webgraph for the new webgraph search
index. This is also available for p2p operation but client access is
not yet implemented.
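
To make the two access points concrete, here is a small sketch that
builds a select URL for either core. Only the 'core' attribute and its
two values come from the description above; the class
'CoreSelectSketch', the peer address 'localhost:8090', the path
'/solr/select' and the parameter 'q' are assumptions for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of YaCy: builds a select URL for one core.
final class CoreSelectSketch {

    static String selectUrl(String peer, String core, String query) {
        try {
            // the query is URL-encoded so that characters like ':' survive transport
            return "http://" + peer + "/solr/select?core=" + core
                + "&q=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(selectUrl("localhost:8090", "collection1", "yacy"));
        System.out.println(selectUrl("localhost:8090", "webgraph", "yacy"));
    }
}
```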
Changed Files: .classpath, defaults/solr.collection.schema, defaults/solr.webgraph.schema, htroot/ConfigHeuristics_p.java, htroot/CrawlStartScanner_p.java, htroot/Crawler_p.html, htroot/Crawler_p.java, htroot/HostBrowser.java, htroot/IndexControlRWIs_p.java, htroot/IndexControlURLs_p.java, htroot/IndexSchema_p.html, htroot/IndexShare_p.java, htroot/Load_RSS_p.java, htroot/ServerScannerList.java, htroot/ViewFile.java, htroot/api/getpageinfo.java, htroot/api/getpageinfo_p.java, htroot/api/status_p.java, htroot/api/status_p.xml, htroot/api/webstructure.java, htroot/env/templates/header.template, htroot/gsa/searchresult.java, htroot/js/Crawler.js, htroot/solr/select.java, htroot/yacy/query.java, htroot/yacy/transferRWI.java, htroot/yacy/transferURL.java, source/net/yacy/cora/federate/opensearch/OpenSearchConnector.java, source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/cora/federate/solr/instance/InstanceMirror.java, source/net/yacy/cora/protocol/HeaderFramework.java, source/net/yacy/cora/protocol/RequestHeader.java, source/net/yacy/cora/protocol/Scanner.java, source/net/yacy/crawler/CrawlStacker.java, source/net/yacy/crawler/data/ResultImages.java, source/net/yacy/crawler/retrieval/HTTPLoader.java, source/net/yacy/crawler/retrieval/RSSLoader.java, source/net/yacy/crawler/retrieval/SitemapImporter.java, source/net/yacy/data/BookmarkHelper.java, source/net/yacy/document/Condenser.java, source/net/yacy/document/Document.java, source/net/yacy/document/parser/html/ContentScraper.java, source/net/yacy/document/parser/html/EmbedEntry.java, source/net/yacy/document/parser/html/ImageEntry.java, source/net/yacy/document/parser/html/ScraperInputStream.java, source/net/yacy/document/parser/htmlParser.java, 
source/net/yacy/document/parser/images/genericImageParser.java, source/net/yacy/document/parser/rssParser.java, source/net/yacy/document/parser/sevenzipParser.java, source/net/yacy/document/parser/sitemapParser.java, source/net/yacy/document/parser/swfParser.java, source/net/yacy/document/parser/tarParser.java, source/net/yacy/document/parser/vcfParser.java, source/net/yacy/document/parser/zipParser.java, source/net/yacy/kelondro/data/meta/DigestURI.java, source/net/yacy/peers/Transmission.java, source/net/yacy/peers/graphics/WebStructureGraph.java, source/net/yacy/peers/operation/yacyRelease.java, source/net/yacy/repository/LoaderDispatcher.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java, source/net/yacy/search/schema/WebgraphConfiguration.java, source/net/yacy/search/schema/WebgraphSchema.java, source/net/yacy/search/snippet/MediaSnippet.java, source/net/yacy/server/http/HTTPDProxyHandler.java
Thu Feb 21 13:23:55 CET 2013
by Michael Peter Christen
introduced a second core named 'webgraph'. This core will hold the link
structure, but is not filled yet. To make a second core possible,
multi-core functionality had to be implemented for the deep-embedded
solr:
- migrated the solr_40 directory content to a subdirectory
'collection1'; the previously used default core is now called
collection1
- added solr_40/webgraph subdirectory as second core
- added a servlet configuration for the second core 'webgraph' in
/IndexSchema_p.html
- added instance handling in addition to solr connections: all solr
connectors are now instances of a solr 'instance' object; this required
a complete re-design of the solr embedding
- also migrated caching and sharding on top of the new instance handling
- migrated the search apis so that they now access a specific core; the
default core is named 'collection1'
- migrated the remote solr search interface to access shards of cores;
for the yacy remote search the default core is now called 'solr'; using
the peer address as solr address
- migrated the solr backup and restore process: old backups cannot be
used after this migration!
- redesign of solr instance handling in all methods which access the
instances: they cannot hold copies of these instances any more; they
must retrieve the actual connection object every time they want to write
to it (this also solves some bugs when switching the index/network)
- added another schema 'solr.webgraph.schema'; the old solr.keys.list is
replaced by solr.collection.schema
Changed Files: defaults/solr.collection.schema, defaults/solr.webgraph.schema, defaults/solr/solr.xml, defaults/yacy.init, htroot/ConfigHeuristics_p.java, htroot/CrawlResults.java, htroot/CrawlStartExpert_p.java, htroot/Crawler_p.java, htroot/HostBrowser.java, htroot/IndexControlURLs_p.java, htroot/IndexFederated_p.java, htroot/IndexSchema_p.html, htroot/IndexSchema_p.java, htroot/PerformanceMemory_p.java, htroot/RankingSolr_p.java, htroot/api/schema.java, htroot/gsa/searchresult.java, htroot/migrateurldb_p.java, htroot/solr/select.java, htroot/yacyinteractive.java, source/net/yacy/cora/federate/opensearch/OpenSearchConnector.java, source/net/yacy/cora/federate/solr/Boost.java, source/net/yacy/cora/federate/solr/SchemaConfiguration.java, source/net/yacy/cora/federate/solr/SchemaDeclaration.java, source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/CachedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/MirrorSolrConnector.java, source/net/yacy/cora/federate/solr/connector/MultipleSolrConnector.java, source/net/yacy/cora/federate/solr/connector/RemoteSolrConnector.java, source/net/yacy/cora/federate/solr/connector/RetrySolrConnector.java, source/net/yacy/cora/federate/solr/connector/ShardSelection.java, source/net/yacy/cora/federate/solr/connector/ShardSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/cora/federate/solr/instance/EmbeddedInstance.java, source/net/yacy/cora/federate/solr/instance/InstanceMirror.java, source/net/yacy/cora/federate/solr/instance/RemoteInstance.java, source/net/yacy/cora/federate/solr/instance/ResponseAccumulator.java, source/net/yacy/cora/federate/solr/instance/ServerMirror.java, source/net/yacy/cora/federate/solr/instance/ServerShard.java, 
source/net/yacy/cora/federate/solr/instance/ShardInstance.java, source/net/yacy/cora/federate/solr/responsewriter/GSAResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/JsonResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/OpensearchResponseWriter.java, source/net/yacy/crawler/data/CrawlQueues.java, source/net/yacy/crawler/data/ZURL.java, source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/migration.java, source/net/yacy/peers/Protocol.java, source/net/yacy/peers/RemoteSearch.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/SwitchboardConstants.java, source/net/yacy/search/index/DocumentIndex.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/query/QueryGoal.java, source/net/yacy/search/query/QueryModifier.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java, source/net/yacy/search/schema/WebgraphConfiguration.java, source/net/yacy/search/schema/WebgraphSchema.java, source/net/yacy/server/serverObjects.java
Fri Feb 15 01:38:10 CET 2013
by Michael Peter Christen
Full redesign of solr connection architecture. This was done to support
multiple solr cores instead of just one. Therefore it is now necessary
to distinguish between a solr server connection (called an 'Instance')
and a connection to a single solr core. One Instance may now have
multiple connector classes assigned to it, each connecting to a single
core.
To support multiple cores it is also necessary to distinguish between
the connection configuration and the configuration of the index schema.
We will have multiple schema configurations in the future, one for
each solr core. For this reason the IndexFederated servlet had to be
split into two parts; the new schema editor is now in the IndexSchema
servlet.
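
The split between an 'Instance' and its per-core connectors might be
pictured as follows. This is a deliberately tiny model under
assumptions: the classes 'InstanceSketch', 'Instance' and
'CoreConnector' and their methods are hypothetical and much simpler than
the real classes in net.yacy.cora.federate.solr.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model, not YaCy code.
final class InstanceSketch {

    // One connector talks to exactly one core.
    static final class CoreConnector {
        final String coreName;
        CoreConnector(String coreName) { this.coreName = coreName; }
    }

    // One instance stands for one solr server connection and hands out
    // a connector per core, creating it on first request.
    static final class Instance {
        private final Map<String, CoreConnector> connectors = new LinkedHashMap<>();

        CoreConnector connectorFor(String coreName) {
            return connectors.computeIfAbsent(coreName, CoreConnector::new);
        }

        int coreCount() {
            return connectors.size();
        }
    }
}
```

A caller would ask the instance for a connector each time it needs one,
for example instance.connectorFor("webgraph"), instead of caching the
connector itself.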
Changed Files: htroot/ConfigHeuristics_p.java, htroot/CrawlResults.java, htroot/CrawlStartExpert_p.java, htroot/IndexFederated_p.html, htroot/IndexFederated_p.java, htroot/IndexSchema_p.html, htroot/IndexSchema_p.java, htroot/api/schema.java, htroot/env/templates/submenuIndexControl.template, htroot/gsa/searchresult.java, source/net/yacy/cora/federate/opensearch/OpenSearchConnector.java, source/net/yacy/cora/federate/solr/Schema.java, source/net/yacy/cora/federate/solr/YaCySchema.java, source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/MirrorSolrConnector.java, source/net/yacy/cora/federate/solr/connector/MultipleSolrConnector.java, source/net/yacy/cora/federate/solr/connector/RemoteSolrConnector.java, source/net/yacy/cora/federate/solr/connector/RetrySolrConnector.java, source/net/yacy/cora/federate/solr/connector/ShardSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/cora/federate/solr/instance/SolrEmbeddedInstance.java, source/net/yacy/cora/federate/solr/instance/SolrInstance.java, source/net/yacy/cora/federate/solr/instance/SolrRemoteInstance.java, source/net/yacy/crawler/data/CrawlQueues.java, source/net/yacy/data/BookmarkDate.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/DocumentIndex.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/index/SolrConfiguration.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java
Wed Feb 13 02:29:47 CET 2013
by Michael Peter Christen
removed the commitWithin attribute because it is not the right way to
update the index for us. It may also be superfluous with the solr 4.0
softcommit.
Changed Files: defaults/yacy.init, htroot/IndexFederated_p.java, source/net/yacy/cora/federate/solr/connector/MirrorSolrConnector.java, source/net/yacy/cora/federate/solr/connector/MultipleSolrConnector.java, source/net/yacy/cora/federate/solr/connector/RetrySolrConnector.java, source/net/yacy/cora/federate/solr/connector/ShardSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/SwitchboardConstants.java, source/net/yacy/search/index/DocumentIndex.java, source/net/yacy/search/index/Fulltext.java
Tue Feb 12 03:42:46 CET 2013
by Michael Peter Christen
extended JSON Response Writer and Opensearch Response Writer for the
Solr search interface in such a way that it is possible to use this
interface for the yacyinteractive search. This search interface is now
much faster because it uses the Solr search directly. For the Solr
interface it was necessary to create a translation from the YaCy search
modifiers to the Solr facet selection. This was added in such a way that
it is generic for the normal YaCy search and works as an on-top
evaluation for Solr queries.
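
A translation from search modifiers to Solr filters could look roughly
like the sketch below. It is an assumption-laden illustration: the class
'ModifierSketch' is hypothetical, only the modifiers 'filetype:' and
'site:' are handled, and the target field names 'url_file_ext_s' and
'host_s' are assumed rather than taken from the QueryModifier code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not the YaCy QueryModifier implementation.
final class ModifierSketch {

    // Collects one filter query per recognized modifier token.
    static List<String> filterQueries(String query) {
        List<String> fq = new ArrayList<>();
        for (String token : query.trim().split("\\s+")) {
            if (token.startsWith("filetype:")) {
                fq.add("url_file_ext_s:" + token.substring("filetype:".length()));
            } else if (token.startsWith("site:")) {
                fq.add("host_s:" + token.substring("site:".length()));
            }
        }
        return fq;
    }

    // Returns the query with all recognized modifier tokens removed.
    static String freeText(String query) {
        StringBuilder text = new StringBuilder();
        for (String token : query.trim().split("\\s+")) {
            if (token.startsWith("filetype:") || token.startsWith("site:")) continue;
            if (text.length() > 0) text.append(' ');
            text.append(token);
        }
        return text.toString();
    }
}
```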
Changed Files: htroot/js/yacyinteractive.js, htroot/solr/select.java, htroot/yacy/search.java, htroot/yacyinteractive.html, htroot/yacysearch.java, source/net/yacy/cora/document/UTF8.java, source/net/yacy/cora/federate/solr/responsewriter/JsonResponseWriter.java, source/net/yacy/cora/federate/solr/responsewriter/OpensearchResponseWriter.java, source/net/yacy/peers/Protocol.java, source/net/yacy/peers/RemoteSearch.java, source/net/yacy/search/query/QueryModifier.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/RankingProcess.java, source/net/yacy/search/query/SearchEvent.java
Mon Feb 04 10:55:49 CET 2013
by Michael Peter Christen
update to Solr 4.1.0
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/lucene-analyzers-common-4.1.0.jar, lib/lucene-analyzers-phonetic-4.1.0.jar, lib/lucene-core-4.1.0.jar, lib/lucene-grouping-4.1.0.jar, lib/lucene-highlighter-4.1.0.jar, lib/lucene-memory-4.1.0.jar, lib/lucene-misc-4.1.0.jar, lib/lucene-queries-4.1.0.jar, lib/lucene-queryparser-4.1.0.jar, lib/lucene-spatial-4.1.0.jar, lib/lucene-suggest-4.1.0.jar, lib/solr-core-4.1.0.License, lib/solr-core-4.1.0.jar, lib/solr-solrj-4.1.0.License, lib/solr-solrj-4.1.0.jar, lib/zookeeper-3.4.5.jar, source/net/yacy/cora/federate/solr/SolrServlet.java, source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/search/index/Fulltext.java
Sun Feb 03 22:32:38 CET 2013
by reger
move the testing method SolrServlet.main to test, making the inclusion of jetty*.jar in distribution and classpath obsolete

- move jetty*.jar to test library 
- move SolrServlet.main as is to test, add also a junit test simulating main 
  - add build.xml cleanup for EmbeddedSolrConnectorTest created test/DATA
- adjust some test compile errors
Changed Files: addon/YaCy.app/Contents/Info.plist, build.xml, libt/jetty-6.1.26-patched-JETTY-1340.jar, libt/jetty-LICENSE-ASL.txt, libt/jetty-util-6.1.26-patched-JETTY-1340.jar, libt/jetty-util-LICENSE-ASL.txt, source/net/yacy/cora/federate/solr/SolrServlet.java, source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, test/de/anomic/document/ParserTest.java, test/de/anomic/yacy/yacyURLTest.java, test/net/yacy/cora/document/MultiProtocolURITest.java, test/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnectorTest.java
Thu Jan 31 13:15:28 CET 2013
by Michael Peter Christen
optimizations when starting large crawl requests with many start urls in
one request:
- allow larger match-fields in html interface
- delete all host hashes at once from zurl
- when deleting by host, do not count size of deleted entries since that
was the reason it took so long
Changed Files: htroot/CrawlStartExpert_p.html, htroot/Crawler_p.java, source/net/yacy/cora/federate/solr/connector/MirrorSolrConnector.java, source/net/yacy/cora/federate/solr/connector/MultipleSolrConnector.java, source/net/yacy/cora/federate/solr/connector/RetrySolrConnector.java, source/net/yacy/cora/federate/solr/connector/ShardSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/cora/protocol/http/HTTPClient.java, source/net/yacy/crawler/data/ZURL.java, source/net/yacy/search/index/Fulltext.java
Wed Jan 23 14:40:58 CET 2013
by Michael Peter Christen
- changed solr commit call and added an optimize option. Since Solr
4.0.0 there is a new softcommit feature which implements a
near-real-time (NRT) search option. The softcommit does not do IO and
does not cause performance issues.
YaCy now has an extension in its solr connectors to use the softcommit
feature. The softcommit call now replaces all places where a hard commit
was used. Furthermore the commit strategy when doing a search from
the web interface was changed (a commit is done every time before a
search is done).

The softcommit feature was implemented because it was needed for the
following changes (customer demands), which are also included in this
git commit:

- added a feature to identify all documents which have unique titles
and/or unique descriptions. These unique flags are disabled by default.
- added also a feature to set a flag when the url from a canonical tag
is equal to the document url. This is also disabled by default.
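
What a 'unique title' flag means can be shown with a small in-memory
sketch. The class 'UniqueTitleSketch' and its map-based input are
assumptions for illustration; the actual feature works on the solr index
and is not reproduced here.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not YaCy code.
final class UniqueTitleSketch {

    // Maps each document id to true if no other document shares its title.
    static Map<String, Boolean> uniqueFlags(Map<String, String> titleById) {
        Map<String, Integer> count = new HashMap<>();
        for (String title : titleById.values()) {
            count.merge(title, 1, Integer::sum);
        }
        Map<String, Boolean> flags = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : titleById.entrySet()) {
            flags.put(e.getKey(), count.get(e.getValue()) == 1);
        }
        return flags;
    }
}
```

The same counting scheme would apply to descriptions.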

To support the new softcommit strategy, the commitWithinMs option was
set to -1 to disable automatic commits based on document insert times.
If documents are inserted continuously then a commit would also happen
continuously whenever the commitWithinMs time is reached. This would
conflict with the regular autocommit of 10 minutes and the new
softcommit strategy.
Changed Files: defaults/yacy.init, htroot/Crawler_p.java, htroot/HostBrowser.java, htroot/IndexFederated_p.java, htroot/index.java, htroot/yacyinteractive.java, htroot/yacysearch.java, source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/MirrorSolrConnector.java, source/net/yacy/cora/federate/solr/connector/MultipleSolrConnector.java, source/net/yacy/cora/federate/solr/connector/RetrySolrConnector.java, source/net/yacy/cora/federate/solr/connector/ShardSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/migration.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/DocumentIndex.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/index/SolrConfiguration.java, source/net/yacy/search/query/SearchEvent.java
Mon Jan 14 03:06:24 CET 2013
by reger
added (manual) urldb migration (link on: Index Administration -> Federated Solr Index)
- migrates all entries in old urldb

Metadata coordinate (lat / lon) NumberFormatException still occurs relatively often (see excerpt below):
- added try/catch for URIMetadataRow (seems not to be needed in URIMetadataNode, as Solr internally checks for number format)
- removed possible type conversion for lat() / lon() comparison with 0.0f, changed to 0.0  (leaving it to the compiler/optimizer to choose number format)

current log excerpt for NumberFormatException:
W 2013/01/14 00:10:07 StackTrace For input string: "-"
java.lang.NumberFormatException: For input string: "-"
	at sun.misc.FloatingDecimal.readJavaFormatString(Unknown Source)
	at java.lang.Double.parseDouble(Unknown Source)
	at net.yacy.kelondro.data.meta.URIMetadataRow$Components.lon(URIMetadataRow.java:525)
	at net.yacy.kelondro.data.meta.URIMetadataRow.lon(URIMetadataRow.java:279)
	at net.yacy.search.index.SolrConfiguration.metadata2solr(SolrConfiguration.java:277)
	at net.yacy.search.index.Fulltext.putMetadata(Fulltext.java:329)
	at transferURL.respond(transferURL.java:152)
...
Caused by: java.lang.NumberFormatException: For input string: "-"
	at sun.misc.FloatingDecimal.readJavaFormatString(Unknown Source)
	at java.lang.Double.parseDouble(Unknown Source)
	at net.yacy.kelondro.data.meta.URIMetadataRow$Components.lon(URIMetadataRow.java:525)
	at net.yacy.kelondro.data.meta.URIMetadataRow.lon(URIMetadataRow.java:279)
	at net.yacy.search.index.SolrConfiguration.metadata2solr(SolrConfiguration.java:277)
	at net.yacy.search.index.Fulltext.putMetadata(Fulltext.java:329)
	at transferURL.respond(transferURL.java:152)
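
The kind of guard described above can be sketched in a few lines. The
class 'CoordinateSketch' and the choice of 0.0 as fallback value are
assumptions for illustration; the actual fix in URIMetadataRow may
differ.

```java
// Hypothetical illustration, not the URIMetadataRow code.
final class CoordinateSketch {

    // Returns 0.0 when the stored value is missing or not a number (e.g. "-").
    static double parseCoordinate(String raw) {
        if (raw == null) return 0.0;
        try {
            return Double.parseDouble(raw.trim());
        } catch (NumberFormatException e) {
            return 0.0;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseCoordinate("-"));     // placeholder value from the log excerpt
        System.out.println(parseCoordinate("8.68"));  // regular coordinate
    }
}
```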
Changed Files: htroot/IndexFederated_p.html, htroot/IndexFederated_p.java, htroot/migrateurldb_p.html, htroot/migrateurldb_p.java, source/net/yacy/document/Condenser.java, source/net/yacy/document/Document.java, source/net/yacy/kelondro/data/meta/URIMetadataRow.java, source/net/yacy/migration.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/index/SolrConfiguration.java, source/net/yacy/search/query/SearchEvent.java
Sat Jan 05 19:00:54 CET 2013
by reger
fix configuration for search page navigators
- added additional config page (ConfigSearchPage_p) for easy setup of search page layout (to not overload ConfigPortal page)
   - currently redundant setting with part of ConfigPortal page
- added missing config for filetype and protocol navigator
- adjusted init of SearchEvent to check navigation config setting
- renamed RankingProcess.getTopicNavigator to getTopics (to distinguish it from the added SearchEvent.getTopicNavigator)
Changed Files: htroot/ConfigSearchPage_p.html, htroot/ConfigSearchPage_p.java, htroot/env/templates/submenuSearchConfiguration.template, htroot/yacy/search.java, htroot/yacysearchtrailer.java, source/net/yacy/search/query/RankingProcess.java, source/net/yacy/search/query/SearchEvent.java
Fri Jan 04 16:39:34 CET 2013
by Michael Peter Christen
added separate delete commands for the local+remote solr index, the old
metadata and old rwi, and for the citation index. The important
advancement is the separation of the citation index deletion, because
that index is responsible for the linkdepth calculation. Now a search
index can be deleted without deleting the citation index, so fewer
clickdepths should need post-processing.
Changed Files: htroot/IndexControlURLs_p.html, htroot/IndexControlURLs_p.java, source/net/yacy/cora/federate/solr/connector/MirrorSolrConnector.java, source/net/yacy/crawler/CrawlSwitchboard.java, source/net/yacy/kelondro/rwi/AbstractBufferedIndex.java, source/net/yacy/kelondro/rwi/IndexCell.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/index/Segment.java
Wed Jan 02 20:55:43 CET 2013
by Michael Peter Christen
Preparations to produce a click depth attribute in the search index.
This attribute can be used for ranking and for other purposes (requested
by a customer).
The click depth is computed in two steps:
- during indexing, the current fill-state of the reverse link index is
used to backtrack from the current page to the root page. The length of
that backtrack is the clickdepth. But this does not necessarily discover
the shortest click depth; a second process that checks again is needed
- added a process tag that can be used to do operations on the existing
index after a crawl, e.g. calculating the shortest clickpath. Added a
field to control this operation, but not yet a method that performs it.
- added a visualization of the clickpath length in the host browser
Changed Files: defaults/solr.keys.list, htroot/HostBrowser.html, htroot/HostBrowser.java, source/net/yacy/cora/federate/solr/ProcessType.java, source/net/yacy/cora/federate/solr/YaCySchema.java, source/net/yacy/kelondro/data/meta/DigestURI.java, source/net/yacy/kelondro/util/ByteBuffer.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/index/SolrConfiguration.java
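Sketch (not part of the commit): the second step described above, finding the shortest click depth, can be pictured as a breadth-first search over a reverse link index. The class `ClickDepthSketch`, the map-based index and the method name are hypothetical and only illustrate the idea; they are not YaCy's actual data structures.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not YaCy code: shortest click depth via BFS
// over a reverse link index (page -> pages that link to it).
class ClickDepthSketch {

    /**
     * @param reverseLinks maps a page URL to the URLs of pages linking to it
     * @param page the page whose click depth is wanted
     * @param root the root page of the host
     * @return minimal number of clicks from root to page, or -1 if no path is known
     */
    static int shortestClickDepth(Map<String, Set<String>> reverseLinks, String page, String root) {
        if (page.equals(root)) return 0;
        Map<String, Integer> depth = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        depth.put(page, 0);
        queue.add(page);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            int d = depth.get(current);
            Set<String> referrers = reverseLinks.getOrDefault(current, Collections.<String>emptySet());
            for (String referrer : referrers) {
                if (depth.containsKey(referrer)) continue; // already visited
                if (referrer.equals(root)) return d + 1;   // BFS reaches the root on a shortest path first
                depth.put(referrer, d + 1);
                queue.add(referrer);
            }
        }
        return -1; // the reverse link index does not (yet) connect this page to the root
    }
}
```

A single backtrack during indexing follows only one chain of referrers, which is why it may report a longer path than this search over the complete index.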
Wed Jan 02 01:59:47 CET 2013
by reger
small sanitary fixes
- exclude unix shell scripts in NSIS windows install archive
- replace link to env/grafics/yacy.gif with yacy.png (build.nsi)
- remove unused code lines (Blacklist_p, Response, WordReferenceVars)
- typo & xhtml (RankingSolr_p.html)
Changed Files: build.nsi, htroot/Blacklist_p.java, htroot/RankingSolr_p.html, htroot/YaCySearchPluginFF.html, htroot/opensearchdescription.xml, htroot/www/welcome.html, htroot/yacysearch.java, htroot/yacysearch_location.java, source/net/yacy/crawler/retrieval/Response.java, source/net/yacy/kelondro/data/word/WordReferenceVars.java
Sat Dec 29 08:24:48 CET 2012
by reger
Adding heuristic to get search results from configured systems which support opensearch specification
- any system supporting opensearch specification can be configured
- search query is only forwarded to remote system if not enough results available on local peer
- discover function provided, checking the local Solr index for links to opensearchdescription files, to add to the config
     - sample config file with some general search engines with opensearch support
Changed Files: defaults/heuristicopensearch.conf, defaults/yacy.init, htroot/ConfigHeuristics_p.html, htroot/ConfigHeuristics_p.java, htroot/yacysearch.java, source/net/yacy/cora/federate/opensearch/OpenSearchConnector.java, source/net/yacy/document/parser/xml/opensearchdescriptionReader.java
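Sketch (not part of the commit): the forwarding rule and the use of an OpenSearch URL template could look roughly like this. The `{searchTerms}` placeholder comes from the OpenSearch specification; the class name, the method names and the threshold parameter are assumptions made for illustration and do not reflect the OpenSearchConnector API.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical sketch, not YaCy code.
class OpenSearchHeuristicSketch {

    // Forward the query only if the local peer has fewer results than wanted.
    static boolean shouldForward(int localResults, int wantedResults) {
        return localResults < wantedResults;
    }

    // Fill the {searchTerms} placeholder of an OpenSearch URL template.
    static String buildUrl(String template, String query) {
        try {
            return template.replace("{searchTerms}", URLEncoder.encode(query, "UTF-8"));
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform
            throw new IllegalStateException(e);
        }
    }
}
```

For example, `buildUrl("http://example.org/search?q={searchTerms}", "yacy p2p")` produces a URL in which the space is form-encoded as `+`.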


Bugfixes   

CommitDescription
Fri Mar 15 10:04:27 CET 2013
by Michael Peter Christen
fixed NPE during index abstract computation
Changed Files: htroot/yacy/search.java, source/net/yacy/search/query/QueryGoal.java
Fri Mar 15 09:35:57 CET 2013
by Michael Peter Christen
fix for wrong class name in log
Changed Files: source/net/yacy/server/http/HTTPDFileHandler.java
Thu Mar 14 03:30:25 CET 2013
by reger
fix error msg in ConfigHeuristics_p
Changed Files: htroot/ConfigHeuristics_p.java
Wed Mar 13 17:55:37 CET 2013
by orbiter
fix for possible memory leaks
Changed Files: htroot/ContentAnalysis_p.java, htroot/RankingSolr_p.java, htroot/yacysearch.java, source/net/yacy/kelondro/rwi/IndexCell.java, source/net/yacy/search/ResourceObserver.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Segment.java
Sun Mar 10 19:46:06 CET 2013
by orbiter
fix for NPE if surrogates do not exist
Changed Files: source/net/yacy/search/Switchboard.java
Sun Mar 10 05:22:18 CET 2013
by reger
in terminateOldSessions, replace the fixed 3 sec immediate-return time with the requested minage parameter
Changed Files: source/net/yacy/server/serverCore.java
Fri Mar 08 14:40:09 CET 2013
by orbiter
added/fixed missing DOCTYPE line (submitted by Thomas)
Changed Files: htroot/index.html
Thu Mar 07 15:31:00 CET 2013
by Michael Peter Christen
fix for wrong mime type in noload crawler
Changed Files: source/net/yacy/cora/document/analysis/Classification.java, source/net/yacy/crawler/retrieval/Response.java
Tue Mar 05 12:19:32 CET 2013
by orbiter
added debug switches for detailed search testing
Changed Files: defaults/yacy.init, source/net/yacy/peers/Protocol.java, source/net/yacy/peers/RemoteSearch.java, source/net/yacy/search/SwitchboardConstants.java, source/net/yacy/search/query/SearchEvent.java
Fri Mar 01 19:18:16 CET 2013
by orbiter
fix for search
Changed Files: source/net/yacy/peers/Protocol.java, source/net/yacy/search/index/Fulltext.java
Fri Mar 01 00:48:28 CET 2013
by orbiter
fix of page navigation for formatted totalcount numbers
Changed Files: htroot/js/yacysearch.js
Tue Feb 26 21:12:44 CET 2013
by bubu
fix link to IndexSchema_p.html
Changed Files: htroot/ConfigHeuristics_p.html
Sun Feb 24 18:17:58 CET 2013
by Michael Peter Christen
fix for webgraph delete query
Changed Files: source/net/yacy/search/index/Fulltext.java
Sat Feb 23 08:14:10 CET 2013
by Michael Peter Christen
fixes to schema
Changed Files: htroot/api/schema.xml, source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java
Wed Feb 13 01:23:05 CET 2013
by Michael Peter Christen
fix to unbalanced tag and license for null objects
Changed Files: htroot/yacysearch.html, source/net/yacy/data/URLLicense.java
Mon Feb 11 13:28:08 CET 2013
by Michael Peter Christen
fix in html parser and bookmark generation
Changed Files: source/net/yacy/data/BookmarkDate.java, source/net/yacy/document/parser/html/TransformerWriter.java, source/net/yacy/document/parser/htmlParser.java
Fri Feb 08 15:12:10 CET 2013
by Michael Peter Christen
fix for xml blacklist import
Changed Files: source/net/yacy/data/list/XMLBlacklistImporter.java
Mon Feb 04 21:24:39 CET 2013
by Michael Peter Christen
catch more exceptions
Changed Files: source/net/yacy/cora/federate/solr/connector/RemoteSolrConnector.java, source/net/yacy/peers/Protocol.java
Mon Feb 04 18:04:52 CET 2013
by Michael Peter Christen
another NPE
Changed Files: source/net/yacy/peers/Protocol.java
Mon Feb 04 17:11:02 CET 2013
by Michael Peter Christen
fixes to internal RWI usage if RWI is switched off (NPE etc)
Changed Files: source/net/yacy/peers/Protocol.java, source/net/yacy/search/query/RankingProcess.java, source/net/yacy/search/query/SearchEvent.java
Mon Feb 04 16:42:10 CET 2013
by Michael Peter Christen
bugfixes and more logging for solr connector
Changed Files: htroot/api/schema.java, source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/RemoteSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/search/query/QueryParams.java
Sat Feb 02 07:21:18 CET 2013
by Michael Peter Christen
fix for domain navigation
Changed Files: source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java
Sat Feb 02 07:20:02 CET 2013
by Michael Peter Christen
NPE fix
Changed Files: source/net/yacy/cora/protocol/http/HTTPClient.java
Tue Jan 29 18:14:14 CET 2013
by orbiter
fixes to index enumeration for vocabulary production
Changed Files: source/net/yacy/search/index/Segment.java
Sun Jan 27 06:13:49 CET 2013
by reger
bugfix: location url for migrate urldb button onclick
Changed Files: htroot/IndexFederated_p.html
Sat Jan 26 03:59:39 CET 2013
by Michael Peter Christen
one more fix for author_sxt
Changed Files: htroot/api/schema.xml
Fri Jan 25 16:06:58 CET 2013
by Michael Peter Christen
catch exception if solr connection change fails
Changed Files: htroot/IndexFederated_p.java
Thu Jan 24 20:09:33 CET 2013
by Marc Nause
*) fixed admin password configuration
Changed Files: reconfigureYACY.sh
Thu Jan 24 17:57:28 CET 2013
by Michael Peter Christen
security fix for suggest (don't let users ask for too much)
Changed Files: htroot/suggest.java
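Sketch (not part of the commit): one common way to keep users from asking for too much is to clamp a requested result count to a fixed upper bound. The commit does not say which parameter was limited or to what value; the class name, the parameter handling and the cap of 30 below are purely assumptions.

```java
// Hypothetical sketch, not YaCy code.
class SuggestLimitSketch {

    // Assumed upper bound; the real limit is not stated in the commit message.
    static final int MAX_SUGGESTIONS = 30;

    // Parse the requested count and clamp it into the range 1..MAX_SUGGESTIONS.
    static int clampCount(String requested, int defaultCount) {
        int count;
        try {
            count = Integer.parseInt(requested);
        } catch (NumberFormatException e) {
            return defaultCount; // missing or malformed parameter
        }
        return Math.max(1, Math.min(count, MAX_SUGGESTIONS));
    }
}
```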
Thu Jan 24 16:34:15 CET 2013
by Michael Peter Christen
fix for external solr schema definition
Changed Files: htroot/api/schema.xml
Thu Jan 24 14:12:31 CET 2013
by Michael Peter Christen
fix NPE when solr does not deliver snippets
Changed Files: source/net/yacy/peers/Protocol.java
Thu Jan 24 01:50:59 CET 2013
by Michael Peter Christen
fix for search result link in ViewFile
Changed Files: htroot/ViewFile.html
Wed Jan 23 04:11:55 CET 2013
by Copro
Fix typo embedd -> embed
Changed Files: htroot/WikiHelp.html, locales/de.lng
Thu Jan 17 21:52:56 CET 2013
by Michael Peter Christen
one more NPE fix
Changed Files: source/net/yacy/search/query/SearchEvent.java
Thu Jan 17 20:10:49 CET 2013
by sixcooler
bump to httpclient / httpcore 4.2.3 (bugfix-release)
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/dependencies.txt, lib/httpclient-4.2.3.License, lib/httpclient-4.2.3.jar, lib/httpcore-4.2.3.License, lib/httpcore-4.2.3.jar, lib/httpmime-4.2.3.License, lib/httpmime-4.2.3.jar, nbproject/project.xml, source/net/yacy/cora/protocol/http/HTTPClient.java
Sat Jan 05 11:52:35 CET 2013
by Michael Peter Christen
fix for Network info
Changed Files: htroot/Network.html


Other Changes   

CommitDescription
Fri Mar 15 10:25:47 CET 2013
by Michael Peter Christen
main release 1.4
Changed Files: build.properties
Fri Mar 15 10:00:06 CET 2013
by Michael Peter Christen
added a restart hint
Changed Files: source/net/yacy/yacy.java
Fri Mar 15 09:40:02 CET 2013
by Michael Peter Christen
turned severe message to warning message about network failure events
Changed Files: htroot/yacy/transferURL.java
Fri Mar 15 00:14:28 CET 2013
by Michael Peter Christen
- do not create a new query for all remote peers
- no document search this time
- adjusted banner and network to not show 'WORDS' but DHT Chunks. This
is to avoid confusion for robinson peers which do not create Word
Entries
Changed Files: defaults/yacy.init, htroot/Network.html, source/net/yacy/peers/Protocol.java, source/net/yacy/peers/RemoteSearch.java, source/net/yacy/peers/graphics/Banner.java, source/net/yacy/search/query/SearchEvent.java
Thu Mar 14 21:13:12 CET 2013
by Michael Peter Christen
use appropriate ranking for each search situation:
- when using the /date modifier, a date ranking profile is used
- when using a site: modifier, a ranking profile supporting longer urls
is used
Changed Files: defaults/yacy.init, source/net/yacy/search/query/QueryParams.java
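Sketch (not part of the commit): choosing a ranking profile from the query modifiers can be pictured as a small selection function. The enum, the class name and the precedence of /date over site: are assumptions; the actual logic lives in QueryParams and may differ.

```java
// Hypothetical sketch, not YaCy code.
class RankingProfileSketch {

    enum Profile { DEFAULT, DATE, INTRANET }

    // Pick a profile from the modifiers found in the query string.
    static Profile select(String query) {
        String[] tokens = query.trim().split("\\s+");
        for (String token : tokens) {
            if (token.equals("/date")) return Profile.DATE; // assumed to take precedence
        }
        for (String token : tokens) {
            if (token.startsWith("site:")) return Profile.INTRANET; // profile favouring longer urls
        }
        return Profile.DEFAULT;
    }
}
```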
Thu Mar 14 17:54:33 CET 2013
by Michael Peter Christen
added all clickdepth computations for source and target paths in
webstructure core
Changed Files: source/net/yacy/cora/federate/solr/SchemaConfiguration.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/WebgraphConfiguration.java
Thu Mar 14 12:13:02 CET 2013
by Michael Peter Christen
refactoring of clickdepth computation as preparation for clickdepth
computation of webgraph links
Changed Files: source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/schema/CollectionConfiguration.java
Thu Mar 14 10:35:21 CET 2013
by Michael Peter Christen
removed unused tag fields
Changed Files: defaults/solr.collection.schema, htroot/ConfigHeuristics_p.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java, source/net/yacy/search/schema/WebgraphConfiguration.java
Thu Mar 14 03:10:54 CET 2013
by reger
adjust Opensearch discover function to new webgraph Solr schema
Changed Files: htroot/ConfigHeuristics_p.html, htroot/ConfigHeuristics_p.java, source/net/yacy/cora/federate/opensearch/OpenSearchConnector.java
Thu Mar 14 01:35:38 CET 2013
by orbiter
added clickdepth field writing for webgraph core (unfinished)
Changed Files: defaults/solr.webgraph.schema, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/WebgraphConfiguration.java, source/net/yacy/search/schema/WebgraphSchema.java
Tue Mar 12 03:13:14 CET 2013
by reger
set RootNodeFlag only if EmbeddedSolr is connected (as RootNodes may receive direct Solr queries)
Changed Files: source/net/yacy/peers/Protocol.java
Mon Mar 11 10:46:29 CET 2013
by Michael Peter Christen
removed target_tag_s (superfluous)
Changed Files: source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/WebgraphConfiguration.java, source/net/yacy/search/schema/WebgraphSchema.java
Sun Mar 10 02:26:24 CET 2013
by Michael Peter Christen
- added more selection criteria for network seed list
- enhanced up script
Changed Files: bin/up.sh, htroot/Network.java
Tue Mar 05 21:28:22 CET 2013
by Michael Peter Christen
fixes to search debugging after testing with the different search
debugging options
Changed Files: defaults/yacy.init, htroot/ConfigProperties_p.html, source/net/yacy/peers/Protocol.java, source/net/yacy/peers/RemoteSearch.java, source/net/yacy/search/SwitchboardConstants.java, source/net/yacy/search/index/Segment.java, source/net/yacy/search/query/SearchEvent.java
Tue Mar 05 12:24:01 CET 2013
by Michael Peter Christen
concurrent snippet fetching from solr results which do not have snippets
Changed Files: source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/query/SearchEvent.java
Mon Mar 04 21:18:54 CET 2013
by orbiter
added filter queries for better image, audio and video results
Changed Files: source/net/yacy/search/query/QueryParams.java
Mon Mar 04 13:01:24 CET 2013
by Michael Peter Christen
added missing cleanup statements for short memory cases during search
Changed Files: source/net/yacy/search/query/SearchEvent.java, source/net/yacy/search/query/SearchEventCache.java
Mon Mar 04 12:01:10 CET 2013
by orbiter
do not put the fulltext field text_t into the search cache because it is
not used there and uses a lot of memory
Changed Files: source/net/yacy/peers/Protocol.java
Mon Mar 04 01:13:17 CET 2013
by Michael Peter Christen
in method exists() also use the new caching-stacks for
documents/metadata
Changed Files: source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/search/index/Fulltext.java
Mon Mar 04 00:17:29 CET 2013
by Michael Peter Christen
enhanced the search result processing
- no waiting time at the end
- switched on 'classic' snippet production and verification (again)
Changed Files: source/net/yacy/search/query/SearchEvent.java
Mon Mar 04 00:07:52 CET 2013
by Michael Peter Christen
DHT-transferred metadata and crawl receipts now also use the delayed
search cache to avoid putting too much IO load on the peer during
search.
Changed Files: htroot/yacy/crawlReceipt.java, htroot/yacy/transferURL.java
Sun Mar 03 23:45:47 CET 2013
by Michael Peter Christen
better protection against OOM during search flush and fixed missing
result push
Changed Files: defaults/yacy.init, source/net/yacy/search/index/Fulltext.java
Sun Mar 03 22:38:50 CET 2013
by Michael Peter Christen
- enhanced concurrency during search without IO blocking
- introduced a second queue to flush remote search results (now: old
metadata structure from DHT peers)
- fixed result counters
Changed Files: htroot/yacysearch.java, htroot/yacysearchitem.java, htroot/yacysearchlatestinfo.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/query/SearchEvent.java
Sun Mar 03 20:38:20 CET 2013
by Marc Nause
*) For some reason this seems to fix a ClassCastException on my system
(OpenJDK).
Changed Files: htroot/robots.java
Sat Mar 02 10:25:52 CET 2013
by Michael Peter Christen
made index storage of DHT search results concurrent. This prevents
blocking by high CPU usage during search. Also: removed the Solr query
for DHT search results; results are taken from the pending queue.
Changed Files: defaults/yacy.init, source/net/yacy/kelondro/data/meta/URIMetadataNode.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/SwitchboardConstants.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/schema/CollectionConfiguration.java
Thu Feb 28 23:55:46 CET 2013
by orbiter
Translated the Domain Navigator as "Anbieter Navigator" (provider
navigator); its benefit is easier to explain that way
Changed Files: locales/de.lng
Thu Feb 28 14:04:08 CET 2013
by orbiter
better/fewer requests to local solr; the request is made in chunks which
are exactly the size needed to present the current search result page.
As a consequence, the next solr requests are made automatically when
switching to the next pages.
Changed Files: source/net/yacy/peers/RemoteSearch.java, source/net/yacy/search/query/SearchEvent.java
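Sketch (not part of the commit): requesting exactly one result page per Solr query amounts to deriving the `start` and `rows` query parameters from the page index. The class and method names are hypothetical; only `start` and `rows` are real Solr parameters.

```java
// Hypothetical sketch, not YaCy code.
class SolrPageChunkSketch {

    // Returns {start, rows} for the given zero-based page index.
    static int[] chunkForPage(int pageIndex, int itemsPerPage) {
        if (pageIndex < 0 || itemsPerPage <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and itemsPerPage > 0");
        }
        int start = pageIndex * itemsPerPage; // offset of the first document on this page
        int rows = itemsPerPage;              // fetch only what the page displays
        return new int[] { start, rows };
    }
}
```

With ten items per page, the third page (index 2) would be fetched with `start=20&rows=10`.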
Thu Feb 28 02:25:39 CET 2013
by Michael Peter Christen
disabled clickdepth computation during crawling since that is repeated
during clean-up phase.
Changed Files: source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/schema/CollectionConfiguration.java
Wed Feb 27 20:58:34 CET 2013
by orbiter
removed the dns prefetch because that was not so useful
Changed Files: source/net/yacy/crawler/CrawlStacker.java
Wed Feb 27 11:43:36 CET 2013
by orbiter
added recrawl/reload to CrawlStartSite for a timeout of 3 days
Changed Files: htroot/CrawlStartSite_p.html
Wed Feb 27 08:24:37 CET 2013
by orbiter
added option to create empty vocabularies
Changed Files: htroot/Vocabulary_p.html, htroot/Vocabulary_p.java
Tue Feb 26 17:53:44 CET 2013
by Michael Peter Christen
removed size request
Changed Files: source/net/yacy/search/Switchboard.java
Mon Feb 25 14:31:50 CET 2013
by Michael Peter Christen
testing the use of solr for portalsearch led to some bugfixing but no
full success: try to comment out the solr search request in
yacy-portalsearch.js
Changed Files: htroot/portalsearch/yacy-portalsearch.js, htroot/solr/select.java, source/net/yacy/kelondro/logging/Log.java, source/net/yacy/search/query/QueryGoal.java, source/net/yacy/server/serverObjects.java, source/net/yacy/upnp/impls/InternetGatewayDevice.java
Mon Feb 25 01:13:03 CET 2013
by Michael Peter Christen
fix for schema export to consider also automatically generated
coordinate fields
Changed Files: .gitignore, htroot/api/schema.java, source/net/yacy/cora/federate/solr/SolrType.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/schema/CollectionConfiguration.java, source/net/yacy/search/schema/CollectionSchema.java
Sat Feb 23 16:45:05 CET 2013
by Michael Peter Christen
- Removed log4j from libraries. This can be removed because the package
log4j-over-slf4j is there. From slf4j all log output is routed to the jdk
logger. Now all logging is consistently done through the jdk logger.
- added some lines to the logging properties to suppress many solr
logging statements. The number of log entries had already become
a performance issue, so removing these from the log should
increase performance.
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, defaults/yacy.logging
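Sketch (not part of the commit): the commit suppresses Solr log statements through entries in defaults/yacy.logging. The same effect can be shown programmatically with java.util.logging. The logger name `org.apache.solr` and the WARNING threshold are assumptions; the commit message does not list the actual properties.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch, not YaCy code.
class QuietSolrLoggingSketch {

    // Keep a strong reference so the configured logger is not garbage-collected.
    static final Logger SOLR = Logger.getLogger("org.apache.solr");

    // Drop everything below WARNING for this logger and its children.
    static void quiet() {
        SOLR.setLevel(Level.WARNING);
    }
}
```

Because all other logging frameworks are routed to the jdk logger, a single level setting like this is enough to silence them as well.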
Sat Feb 23 14:33:17 CET 2013
by orbiter
updated wstx-asl to 3.2.9
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml, lib/wstx-asl-3.2.9.jar
Fri Feb 22 22:17:45 CET 2013
by reger
on remote Solr search, take only the locally enabled schema fields from the remote solrdocument when building the inputdocument that is added to the local index
Changed Files: source/net/yacy/cora/federate/solr/YaCySchema.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Segment.java
Fri Feb 22 22:03:32 CET 2013
by reger
remove obsolete Solr "commit within" input field from IndexFederated
see https://gitorious.org/yacy/rc1/commit/41116066548be3d7987d7eaa73f2aac43e6f1e43
Changed Files: htroot/IndexFederated_p.html
Sun Feb 17 03:26:46 CET 2013
by reger
remove CPGEN from Windows batch files
(classpath for all needed libraries is defined in manifest of yacycore.jar)
Changed Files: build.xml, startYACY.bat, stopYACY.bat
Sat Feb 16 20:33:27 CET 2013
by orbiter
fixed interactive search which caused an error if pubDate is not present
in a search result
Changed Files: htroot/js/yacyinteractive.js
Fri Feb 15 01:58:28 CET 2013
by Michael Peter Christen
prevent crawl starts with very large url lists from causing a time-out
in the user front-end
Changed Files: source/net/yacy/search/Switchboard.java
Wed Feb 13 19:29:40 CET 2013
by Marc Nause
*) removed Skype online indicator (was not working anymore)
*) updated ICQ URLs
Changed Files: htroot/ViewProfile.html, htroot/ViewProfile.java
Wed Feb 13 01:11:57 CET 2013
by Michael Peter Christen
added jsonp option to yjson result writer
Changed Files: htroot/solr/select.java, source/net/yacy/cora/federate/solr/responsewriter/JsonResponseWriter.java
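Sketch (not part of the commit): a JSONP option usually means wrapping the JSON payload in a call to a client-supplied callback function. The class name, the method signature and the validation of the callback name are assumptions and do not describe JsonResponseWriter.

```java
// Hypothetical sketch, not YaCy code.
class JsonpSketch {

    // Wrap the JSON text in callback(...); return plain JSON if no callback is given.
    static String wrap(String json, String callback) {
        if (callback == null || callback.isEmpty()) return json;
        // Assumed safety check: accept only identifier-like callback names.
        if (!callback.matches("[A-Za-z_$][A-Za-z0-9_$.]*")) {
            throw new IllegalArgumentException("invalid callback name");
        }
        return callback + "(" + json + ");";
    }
}
```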
Wed Feb 13 00:33:53 CET 2013
by Michael Peter Christen
Added image license generation for solr image search results when
results are generated within yjson result writer. This makes it possible
to view images in yacyinteractive from solr.
Changed Files: htroot/ViewImage.java, htroot/js/yacyinteractive.js, htroot/yacysearchitem.java, source/net/yacy/cora/document/MultiProtocolURI.java, source/net/yacy/cora/federate/solr/responsewriter/JsonResponseWriter.java, source/net/yacy/data/URLLicense.java, source/net/yacy/search/Switchboard.java
Wed Feb 13 00:01:38 CET 2013
by Michael Peter Christen
fixed json search, quotes, auto-facets, urls etc. for
yacyinteractive.html
Changed Files: htroot/solr/select.java, htroot/yacyinteractive.html, source/net/yacy/cora/federate/solr/responsewriter/JsonResponseWriter.java, source/net/yacy/search/query/QueryModifier.java, source/net/yacy/server/http/HTTPDemon.java, source/net/yacy/server/serverObjects.java
Tue Feb 12 22:03:10 CET 2013
by Michael Peter Christen
Moved methods from SolrServerConnector to AbstractSolrConnector with the
result that most of these methods become superfluous in other classes.
This is a generalization step towards multi-indexes in Solr.
Changed Files: source/net/yacy/cora/federate/solr/connector/AbstractSolrConnector.java, source/net/yacy/cora/federate/solr/connector/EmbeddedSolrConnector.java, source/net/yacy/cora/federate/solr/connector/MirrorSolrConnector.java, source/net/yacy/cora/federate/solr/connector/MultipleSolrConnector.java, source/net/yacy/cora/federate/solr/connector/RetrySolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java
Tue Feb 12 12:21:29 CET 2013
by Michael Peter Christen
better filesearch layout
Changed Files: htroot/js/yacyinteractive.js
Tue Feb 12 12:00:54 CET 2013
by Michael Peter Christen
reduced number of facets in yacyinteractive (only filetype necessary)
Changed Files: htroot/js/yacyinteractive.js, htroot/yacyinteractive.html
Tue Feb 12 11:52:33 CET 2013
by Michael Peter Christen
reverted put-semantics back to as-usual in serverObjects and introduced
an add-method to put in several objects for the same key
Changed Files: htroot/ViewFile.java, htroot/yacysearchitem.java, source/net/yacy/server/http/HTTPDFileHandler.java, source/net/yacy/server/http/HTTPDemon.java, source/net/yacy/server/http/TemplateEngine.java, source/net/yacy/server/serverObjects.java
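Sketch (not part of the commit): the difference between the usual put-semantics and a separate add-method for several values under one key can be illustrated with a small multi-value map. The class below is hypothetical and much simpler than serverObjects.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not YaCy code.
class MultiValueParamsSketch {

    private final Map<String, List<String>> values = new HashMap<>();

    // put: as usual, replaces whatever was stored for the key.
    void put(String key, String value) {
        List<String> list = new ArrayList<>();
        list.add(value);
        values.put(key, list);
    }

    // add: appends a further value for the same key.
    void add(String key, String value) {
        values.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // First value for the key, or null if the key is unknown.
    String get(String key) {
        List<String> list = values.get(key);
        return list == null || list.isEmpty() ? null : list.get(0);
    }

    // All values for the key, in insertion order.
    List<String> getAll(String key) {
        List<String> list = values.get(key);
        return list == null ? Collections.<String>emptyList() : Collections.unmodifiableList(list);
    }
}
```

Keeping put as a replacing operation means existing callers behave as before, while callers that need repeated keys (such as multiple facet requests) opt in through add.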
Mon Feb 11 22:53:19 CET 2013
by reger
make sure yacy.running is deleted if not running (catch exception)
- to prevent the following log output if YaCy was previously not shut down properly

E ... STARTUP WARNING: the file C:\src\git\yacy-rc1\DATA\yacy.running exists, this usually means that a YaCy instance is still running
E ... STARTUP FATAL ERROR: java.util.concurrent.TimeoutException
java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException
	at net.yacy.cora.protocol.TimeoutRequest.call(TimeoutRequest.java:91)
	at net.yacy.cora.protocol.TimeoutRequest.ping(TimeoutRequest.java:112)
	at net.yacy.yacy.startup(yacy.java:200)
	at net.yacy.yacy.main(yacy.java:638)
Caused by: java.util.concurrent.TimeoutException

- adjust Netbeans path (to solr4.1.jars)
Changed Files: nbproject/project.xml, source/net/yacy/yacy.java
Mon Feb 11 22:12:15 CET 2013
by Michael Peter Christen
extended the serverObjects to be able to hold multiple values for a
single key. This is done using the solr class MultiMapSolrParams. That
class is needed in the OpensearchResultWriter to get multiple facet
requests.
Changed Files: htroot/gsa/searchresult.java, source/net/yacy/data/WorkTables.java, source/net/yacy/server/http/HTTPDFileHandler.java, source/net/yacy/server/http/HTTPDemon.java, source/net/yacy/server/http/TemplateEngine.java
Mon Feb 11 22:10:14 CET 2013
by Michael Peter Christen
added more metadata fields and facets to OpensearchResponseWriter.
This should make it possible to replace the original and enriched yacy
opensearch result with a solr output in opensearch format.
Changed Files: defaults/solr.keys.list, htroot/yacysearchitem.java, source/net/yacy/cora/document/RSSMessage.java, source/net/yacy/cora/federate/solr/responsewriter/OpensearchResponseWriter.java, source/net/yacy/cora/lod/vocabulary/YaCyMetadata.java, source/net/yacy/server/serverObjects.java, source/net/yacy/server/servletProperties.java
Sat Feb 09 06:57:20 CET 2013
by Michael Peter Christen
moved bookmarks back to more prominent location (even if this does not
fit to the 'Search Interfaces' headline)
Changed Files: htroot/env/templates/header.template
Sat Feb 09 06:55:57 CET 2013
by Michael Peter Christen
better error handling for bookmarks
Changed Files: htroot/Bookmarks.java, htroot/api/bookmarks/get_bookmarks.java, htroot/api/bookmarks/get_folders.java, htroot/api/bookmarks/posts/all.java, htroot/api/bookmarks/posts/get.java, htroot/api/bookmarks/tags/addTag_p.java, htroot/api/bookmarks/xbel/xbel.java, htroot/api/ymarks/import_ymark.java, source/net/yacy/data/BookmarksDB.java
Fri Feb 08 18:30:08 CET 2013
by Michael Peter Christen
when searching the network, do not search on robinson peers with the old
DHT search interface. Now use the solr interface.
Changed Files: source/net/yacy/peers/DHTSelection.java, source/net/yacy/peers/RemoteSearch.java
Fri Feb 08 17:58:54 CET 2013
by Michael Peter Christen
A robinson peer does not need to write RWI data if such peers are only
searched using the solr interface. Searching public robinsons will be
done with solr only in the future.
Changed Files: source/net/yacy/peers/Protocol.java, source/net/yacy/search/Switchboard.java
Fri Feb 08 12:45:54 CET 2013
by Michael Peter Christen
fixed a problem with re-feeding of already indexed documents with
coordinates attached.
Changed Files: source/net/yacy/cora/federate/solr/YaCySchema.java, source/net/yacy/peers/Protocol.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/Segment.java
Thu Feb 07 23:39:00 CET 2013
by Michael Peter Christen
Observation: Windows users simply forget that they started YaCy; YaCy is
still running, and the user expects that another double-click on the
YaCy icon simply opens the search window (again). I decided to add a
function that meets this expectation: simply open the browser pop-up
page again if the user starts YaCy while YaCy is still running.
Changed Files: source/net/yacy/yacy.java
Tue Feb 05 21:02:32 CET 2013
by Marc Nause
*) only install files from the RELEASE directory
*) minor changes
Changed Files: htroot/ConfigUpdate_p.java, htroot/Steering.html, htroot/Steering.java, source/net/yacy/kelondro/util/FileUtils.java
Tue Feb 05 12:47:20 CET 2013
by Michael Peter Christen
added a disable function in RemoteCrawl_p servlet which prevents setting
of remote crawl if peer is not a senior or principal peer
Changed Files: htroot/RemoteCrawl_p.html, htroot/RemoteCrawl_p.java, source/net/yacy/search/Switchboard.java
Mon Feb 04 21:24:57 CET 2013
by Michael Peter Christen
show a link for the host in the host browser; see
Changed Files: htroot/HostBrowser.html, skins/generic_pd.css, skins/pdblue.css
Mon Feb 04 19:57:28 CET 2013
by Marc Nause
*) added protection against CSRF in update download page
(http://localhost:8090/ConfigUpdate_p.html?releaseinstall=../../test.txt&deleteRelease=Delete+Release
does not work anymore)
Changed Files: htroot/ConfigUpdate_p.java, source/net/yacy/kelondro/util/FileUtils.java
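Sketch (not part of the commit): a request parameter such as `../../test.txt` can be rejected by resolving it against the intended directory and checking that the canonical path stays inside it. The class and method names are hypothetical; the commit message does not describe how ConfigUpdate_p or FileUtils actually implement the check.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical sketch, not YaCy code.
class ReleaseFileCheckSketch {

    // True only if requestedName resolves to a file located below directory.
    static boolean isInside(File directory, String requestedName) {
        try {
            String dirPath = directory.getCanonicalPath();
            String filePath = new File(directory, requestedName).getCanonicalPath();
            // Canonical paths have ".." segments resolved, so a prefix test is meaningful.
            return filePath.startsWith(dirPath + File.separator);
        } catch (IOException e) {
            return false; // treat unresolvable paths as not allowed
        }
    }
}
```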
Mon Feb 04 17:48:04 CET 2013
by Michael Peter Christen
use thread-safe http connection manager for authenticated remote solr
connections
Changed Files: source/net/yacy/cora/federate/solr/connector/RemoteSolrConnector.java, source/net/yacy/cora/federate/solr/connector/SolrServerConnector.java
Mon Feb 04 12:02:37 CET 2013
by Michael Peter Christen
arrr... forgot the new library
Changed Files: lib/guava-13.0.1.jar
Mon Feb 04 11:21:05 CET 2013
by Michael Peter Christen
guava update
Changed Files: .classpath, addon/YaCy.app/Contents/Info.plist, build.xml
Sun Feb 03 23:24:19 CET 2013
by sixcooler
remove jetty from classpath - as it was moved last commit
Changed Files: .classpath
Sat Feb 02 10:52:39 CET 2013
by orbiter
removed unused import
Changed Files: source/net/yacy/yacy.java
Sat Feb 02 09:51:43 CET 2013
by Michael Peter Christen
enhanced network scanner, is faster and more flexible now
- start more processes
- remove superfluous host name resolution
- better/more flexible subnet ip range calculation
- preferring ipv4 gives more usable ip pre-settings in the servlet
- extended servlet by new subnet /20 - option
- redesign of scanner start process in servlet (generalization)
Changed Files: htroot/CrawlStartScanner_p.html, htroot/CrawlStartScanner_p.java, source/net/yacy/cora/protocol/Domains.java, source/net/yacy/cora/protocol/Scanner.java, source/net/yacy/yacy.java
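Sketch (not part of the commit): a flexible subnet range calculation, for example for the new /20 option, can be done with a bit mask over the 32-bit IPv4 address. The class and method names are hypothetical and do not mirror the code in Scanner or Domains.

```java
// Hypothetical sketch, not YaCy code.
class SubnetRangeSketch {

    // Returns {first, last} address of the subnet as unsigned 32-bit values held in longs.
    static long[] range(String ipv4, int prefixLength) {
        if (prefixLength < 0 || prefixLength > 32) {
            throw new IllegalArgumentException("prefix length must be 0..32");
        }
        String[] parts = ipv4.split("\\.");
        if (parts.length != 4) throw new IllegalArgumentException("not an IPv4 address");
        long address = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) throw new IllegalArgumentException("octet out of range");
            address = (address << 8) | octet;
        }
        long mask = (0xFFFFFFFFL << (32 - prefixLength)) & 0xFFFFFFFFL;
        long first = address & mask;            // host bits cleared
        long last = first | (~mask & 0xFFFFFFFFL); // host bits set
        return new long[] { first, last };
    }

    // Formats an unsigned 32-bit value as dotted-quad text.
    static String format(long address) {
        return ((address >> 24) & 0xFF) + "." + ((address >> 16) & 0xFF) + "."
                + ((address >> 8) & 0xFF) + "." + (address & 0xFF);
    }
}
```

A /20 subnet leaves 12 host bits, so a scan of such a range covers 4096 addresses.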
Sat Feb 02 07:20:56 CET 2013
by Michael Peter Christen
less search overhead when first result set is smaller than requested
Changed Files: source/net/yacy/peers/RemoteSearch.java
Wed Jan 30 19:33:48 CET 2013
by orbiter
ability to create vocabularies also without any objectspace: this
iterates over all urls in the index to create terms
Changed Files: htroot/Vocabulary_p.java, source/net/yacy/search/index/Segment.java
Tue Jan 29 03:01:57 CET 2013
by reger
adding classpath to manifest of yacycore.jar
- this allows starting without an explicit java -cp (just java -jar lib/yacycore.jar works)
- especially helpful when running YaCy as a Windows service,
  since the classpath config of the service wrapper no longer needs adjusting on upgrades of lib/*.jar's
Changed Files: build.xml
Mon Jan 28 17:50:23 CET 2013
by Michael Peter Christen
allow more links when starting a crawl by file
Changed Files: htroot/Crawler_p.java
Sat Jan 26 23:43:09 CET 2013
by reger
correct header menu in migrateurldb_p.html
- update NetBeans project path
Changed Files: htroot/migrateurldb_p.html, nbproject/project.xml
Sat Jan 26 03:34:46 CET 2013
by Michael Peter Christen
- add the copyField author_sxt only if author exists
- set the solr default search field according to existing fields
Changed Files: htroot/api/schema.java, htroot/api/schema.xml
Fri Jan 25 13:57:10 CET 2013
by Michael Peter Christen
Merge remote-tracking branch 'aleksejs/rutrans3'
Changed Files: locales/ru.lng
Fri Jan 25 13:05:48 CET 2013
by Aleksej
Russian translation fixes and additions
Changed Files: locales/ru.lng
Fri Jan 25 10:08:08 CET 2013
by Dmitriy Kazimirov
Russian localization:index.html fix
Changed Files: locales/ru.lng
Fri Jan 25 04:24:36 CET 2013
by sixcooler
clear some more caches if running out of memory
Changed Files: source/net/yacy/cora/protocol/Domains.java, source/net/yacy/repository/Blacklist.java, source/net/yacy/search/ResourceObserver.java
Thu Jan 24 18:25:28 CET 2013
by Michael Peter Christen
added a copyField for author_sxt for automated schema generation
Changed Files: htroot/api/schema.xml
Thu Jan 24 18:24:31 CET 2013
by Michael Peter Christen
turned author_s into the multi-valued field author_sxt
Changed Files: defaults/solr/schema.xml, source/net/yacy/cora/federate/solr/YaCySchema.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java
Thu Jan 24 12:39:19 CET 2013
by Michael Peter Christen
migrated the index export methods from the old metadata to solr. Now
exports are done using solr queries. removed superfluous methods and
servlets.
Changed Files: htroot/CrawlResults.java, htroot/Crawler_p.java, htroot/IndexControlURLs_p.html, htroot/IndexControlURLs_p.java, htroot/IndexControlURLs_p.xml, source/net/yacy/crawler/data/ResultURLs.java, source/net/yacy/search/index/Fulltext.java, source/net/yacy/search/query/QueryGoal.java
Thu Jan 24 03:26:38 CET 2013
by Michael Peter Christen
removed field selection because it produced documents containing only
that field, which is not useful when re-writing the same document
Changed Files: source/net/yacy/search/index/Segment.java
Wed Jan 23 11:11:45 CET 2013
by Dmitriy Kazimirov
Russian localization update
Changed Files: locales/ru.lng
Wed Jan 23 14:41:17 CET 2013
by Michael Peter Christen
Merge remote-tracking branch 'copro/master'
Changed Files: htroot/WikiHelp.html, locales/de.lng, source/net/yacy/data/wiki/WikiCode.java
Wed Jan 23 04:00:15 CET 2013
by Copro
Adding Vimeo tag to wiki commands to embed a Vimeo video by id
Changed Files: htroot/WikiHelp.html, locales/de.lng, source/net/yacy/data/wiki/WikiCode.java
Wed Jan 23 02:43:58 CET 2013
by Copro
Added feature to embed YouTube videos via wiki commands for use in
Wiki, Blog or other servlets
Changed Files: htroot/WikiHelp.html, locales/de.lng, source/net/yacy/data/wiki/WikiCode.java
Tue Jan 22 17:01:49 CET 2013
by Michael Peter Christen
Merge remote-tracking branch 'reger/master'
Changed Files: assembly.xml, libbuild/GitRevTask/GitRevTask.java, libbuild/pom.xml, pom.xml
Tue Jan 22 17:01:18 CET 2013
by Michael Peter Christen
Merge remote-tracking branch 'copro/master'
Changed Files: locales/de.lng
Tue Jan 22 15:33:49 CET 2013
by Copro
Some more German translation reducing the amount of Unused String
messages
Changed Files: locales/de.lng
Tue Jan 22 13:19:07 CET 2013
by Aleksej
Russian translation fixes not merged due to conflict
Changed Files: locales/ru.lng
Tue Jan 22 11:54:38 CET 2013
by Michael Peter Christen
Merge remote-tracking branch 'aleksejs/fixtrans'

Conflicts:
	locales/ru.lng
	
Tried to merge this but I had to do it 'blind'.
Sorry if I deleted something that was right.
Changed Files: htroot/ConfigHTCache_p.java, locales/cn.lng, locales/ru.lng, source/net/yacy/cora/protocol/Scanner.java, source/net/yacy/crawler/data/Cache.java
Tue Jan 22 05:14:37 CET 2013
by Copro
Added German translation for HostBrowser.html
Changed Files: locales/de.lng
Mon Jan 21 15:32:12 CET 2013
by Dmitriy Kazimirov
updated Russian localization for update system
Changed Files: locales/ru.lng
Sun Jan 20 14:01:29 CET 2013
by Dmitriy Kazimirov
A few more fixes for Russian localization
Changed Files: locales/ru.lng
Sun Jan 20 11:26:11 CET 2013
by Dmitriy Kazimirov
A few more fixes for Russian localization
Changed Files: locales/ru.lng
Sat Jan 19 22:52:37 CET 2013
by Dmitriy Kazimirov
Little more correct and readable Russian localization
Changed Files: locales/ru.lng
Sat Jan 19 22:43:55 CET 2013
by Dmitriy Kazimirov
Little more correct and readable Russian localization
Changed Files: locales/ru.lng
Sat Jan 19 14:21:47 CET 2013
by Dmitriy Kazimirov
More Russian translations. Untranslated text now falls back to English instead of German
Changed Files: locales/ru.lng
Mon Jan 21 18:02:29 CET 2013
by Michael Peter Christen
added new solr fields (unused yet; implementation will follow)
Changed Files: defaults/solr.keys.list, source/net/yacy/cora/federate/solr/YaCySchema.java
Mon Jan 21 17:59:42 CET 2013
by Michael Peter Christen
removed archaic migration code
Changed Files: source/net/yacy/server/serverSwitch.java, source/net/yacy/yacy.java
Mon Jan 21 17:55:28 CET 2013
by Michael Peter Christen
Reverted setting of MMapDirectoryFactory from solrconfig; see
http://forum.yacy-websuche.de/viewtopic.php?p=27509#p27509
Instead, the start script now checks whether the host is a 64-bit host and,
if so, sets -Dsolr.directoryFactory=solr.MMapDirectoryFactory as a Java option
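
For illustration only, the same decision can be sketched in Java. The real check lives in startYACY.sh and is not reproduced here; the class and method names below are hypothetical, and reading the "sun.arch.data.model" and "os.arch" system properties is an assumption about how a 64-bit host could be detected from inside a JVM.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration; the actual check is done in startYACY.sh.
class DirectoryFactoryOptionSketch {

    static final String MMAP_OPTION = "-Dsolr.directoryFactory=solr.MMapDirectoryFactory";

    // Guesses whether the platform is 64-bit from two JVM system properties.
    // dataModel: value of "sun.arch.data.model" (may be null or "unknown").
    // osArch:    value of "os.arch" (for example "amd64" or "aarch64").
    static boolean is64Bit(String dataModel, String osArch) {
        if ("64".equals(dataModel)) return true;
        if ("32".equals(dataModel)) return false;
        // Fall back to the architecture name when the data model is unavailable.
        return osArch != null && osArch.toLowerCase(Locale.ROOT).contains("64");
    }

    // Returns the extra JVM options to add when starting the peer.
    static List<String> extraJavaOptions(boolean is64Bit) {
        List<String> options = new ArrayList<>();
        if (is64Bit) {
            options.add(MMAP_OPTION);
        }
        return options;
    }

    public static void main(String[] args) {
        boolean is64 = is64Bit(System.getProperty("sun.arch.data.model"),
                               System.getProperty("os.arch"));
        System.out.println(extraJavaOptions(is64));
    }
}
```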

Reverted the ramBufferSizeMB setting (this was not enabled anyway)
because that may be too much memory for small peers and embedded
systems.

Activated the mergeFactor 4; this was commented out by mistake
Changed Files: defaults/solr/solrconfig.xml, startYACY.sh
Sun Jan 20 21:08:59 CET 2013
by reger
add Maven build script
Changed Files: assembly.xml, libbuild/GitRevTask/GitRevTask.java, libbuild/pom.xml, pom.xml
Sat Jan 19 11:21:33 CET 2013
by orbiter
solr performance settings
the target of these performance settings is the reduction of IO in
general and during search in particular.
- reduced mergeFactor to 4. This will increase the IO during indexing,
but will reduce IO during search. It will also greatly reduce the number
of open files, which should allow larger indexes overall before the
operating system's limit on open files is reached.
- increased ramBufferSizeMB to 256 MB. This will reduce the number of
commits. This change may compensate for the reduction of the mergeFactor.
- disabled updateLog. This is a real-time search feature which is
available in YaCy anyway because a commit is forced if index.html is
called. The updateLog feature causes a lot of IO during indexing and
search and produces a lot of files in SEGMENTS/solr_40/data/tlog
Changed Files: defaults/solr/solrconfig.xml
Thu Jan 17 01:04:50 CET 2013
by Michael Peter Christen
moved the 'all' option to the end of the list because it currently
also selects lists which cannot be exported to XML correctly
Changed Files: htroot/BlacklistImpExp_p.html
Wed Jan 16 17:38:06 CET 2013
by Michael Peter Christen
fix for wrong robots.txt loading for https protocol
see also: http://forum.yacy-websuche.de/viewtopic.php?f=5&t=4579
Changed Files: source/net/yacy/crawler/robots/RobotsTxt.java
Wed Jan 16 16:18:03 CET 2013
by Michael Peter Christen
integrated the search term into the opensearch result title. This gives
better bookmark names when subscribing to multiple search results from
the same peer
Changed Files: htroot/yacysearch.rss
Wed Jan 16 14:54:35 CET 2013
by Michael Peter Christen
relaxing site operator for www prefix:
- when using a site operator search for a domain where the domain has a
www prefix, the domain without the www is also included
- when using a site operator search for a domain where the domain has no
www prefix, the domain with the www is also included
- in the host navigator, all domains with and without a www prefix are
accumulated. That means that the host navigator never shows a host
with a www prefix.
This should prevent usage mistakes of the site operator.
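
The www relaxation described above can be sketched roughly as follows. This is a minimal illustration, not the code from QueryParams.java or SearchEvent.java; the class and method names (SiteOperatorSketch, hostVariants, navigatorKey, accumulate) are hypothetical, and counting one hit per listed host is an assumption made to keep the example small.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not YaCy code: illustrates the www relaxation only.
class SiteOperatorSketch {

    private static final String WWW = "www.";

    // Host form used by the navigator: always without the www prefix.
    static String navigatorKey(String host) {
        String h = host.toLowerCase(Locale.ROOT);
        return h.startsWith(WWW) ? h.substring(WWW.length()) : h;
    }

    // Returns the host with and without the www prefix, so that a site
    // operator given in either form matches documents from both.
    static Set<String> hostVariants(String host) {
        String bare = navigatorKey(host);
        Set<String> variants = new LinkedHashSet<>();
        variants.add(bare);
        variants.add(WWW + bare);
        return variants;
    }

    // Counts hits per host, merging www and non-www entries under one key.
    static Map<String, Integer> accumulate(List<String> hosts) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String host : hosts) {
            counts.merge(navigatorKey(host), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(hostVariants("www.example.org"));
        System.out.println(accumulate(List.of("www.example.org", "example.org", "yacy.net")));
    }
}
```

In this sketch a query for site:example.org and one for site:www.example.org expand to the same two hosts, and the navigator map never contains a key that starts with www.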
Changed Files: source/net/yacy/peers/Protocol.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java
Wed Jan 16 14:35:37 CET 2013
by Michael Peter Christen
using MMapDirectoryFactory as a solution for the ClosedChannelException
described in https://issues.apache.org/jira/browse/SOLR-2247
Changed Files: defaults/solr/solrconfig.xml
Wed Jan 16 11:07:20 CET 2013
by Michael Peter Christen
fixed an NPE which may appear for freeworld peers without any rwi index
data. The NPE looked like this:
Caused by: java.lang.NullPointerException
	at net.yacy.search.query.SearchEvent.<init>(SearchEvent.java:279)
	at net.yacy.search.query.SearchEventCache.getEvent(SearchEventCache.java:155)
	at search.respond(search.java:314)
	... 12 more
Changed Files: source/net/yacy/search/query/SearchEvent.java
Tue Jan 15 16:20:43 CET 2013
by Michael Peter Christen
added a timeout for topic computation (solr is here much slower than the
old metadata-db)
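
One common way to bound such a computation is a time budget checked inside the loop. The sketch below shows only that general pattern; it is not how RankingProcess or SearchEvent implement the timeout, and the class name, method name and the use of plain strings as topic candidates are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a time-bounded loop; not YaCy code.
class TopicTimeoutSketch {

    // Collects candidates until the input is exhausted or the time budget
    // is used up, whichever happens first. Partial results are returned.
    static List<String> collectTopics(Iterable<String> candidates, long timeoutMillis) {
        long start = System.nanoTime();
        long budget = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        List<String> topics = new ArrayList<>();
        for (String candidate : candidates) {
            if (System.nanoTime() - start >= budget) {
                break; // budget exhausted: stop and keep what we have
            }
            topics.add(candidate);
        }
        return topics;
    }

    public static void main(String[] args) {
        System.out.println(collectTopics(List.of("search", "engine", "peer"), 500));
    }
}
```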
Changed Files: htroot/yacy/search.java, source/net/yacy/search/query/RankingProcess.java, source/net/yacy/search/query/SearchEvent.java
Mon Jan 14 12:50:21 CET 2013
by Michael Peter Christen
added an 'inlink' search option according to the suggestion in the YaCy
forum at
http://forum.yacy-websuche.de/viewtopic.php?f=18&t=4572#p27410

The feature is called 'inlink' rather than 'haslink' so that its name is
analogous to 'inurl'. You can now search for words in the links of a
document, for example:
* inlink:yacy
finds all documents which link to pages that have 'yacy' in the
URL.
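
As a rough idea of how such a modifier could be separated from the plain search words, here is a minimal Java sketch. It is not the parsing code in QueryParams.java; the class InlinkModifierSketch, its Parsed holder and the whitespace-based tokenization are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of splitting query modifiers; not YaCy code.
class InlinkModifierSketch {

    static final String INLINK = "inlink:";
    static final String INURL = "inurl:";

    // Plain search words plus the values given to the two modifiers.
    static final class Parsed {
        final List<String> words = new ArrayList<>();
        final List<String> inurl = new ArrayList<>();
        final List<String> inlink = new ArrayList<>();
    }

    // Splits the query on whitespace and sorts each token into its bucket.
    static Parsed parse(String query) {
        Parsed parsed = new Parsed();
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // happens only for an empty query
            }
            if (token.startsWith(INLINK) && token.length() > INLINK.length()) {
                parsed.inlink.add(token.substring(INLINK.length()));
            } else if (token.startsWith(INURL) && token.length() > INURL.length()) {
                parsed.inurl.add(token.substring(INURL.length()));
            } else {
                parsed.words.add(token);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        Parsed parsed = parse("search inlink:yacy");
        System.out.println(parsed.words + " " + parsed.inlink);
    }
}
```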
Changed Files: htroot/index.html, htroot/yacy/search.java, htroot/yacysearch.java, source/net/yacy/search/query/QueryParams.java
Mon Jan 14 12:33:01 CET 2013
by Michael Peter Christen
with strict compiler settings, IndexFederated_p does not compile without
@SuppressWarnings("deprecation")
Changed Files: htroot/IndexFederated_p.java
Sat Jan 12 15:20:23 CET 2013
by reger
prevent checking of urldb if empty
- disconnect urlIndexFile if empty
- add missing lock class in submenuSearchConfiguration
Changed Files: htroot/env/templates/submenuSearchConfiguration.template, source/net/yacy/search/index/Fulltext.java
Sat Jan 05 20:47:18 CET 2013
by reger
read defaults from yacy.init for "Set to Defaults" button
Changed Files: htroot/ConfigPortal.java, htroot/ConfigSearchPage_p.java
Sat Jan 05 01:00:18 CET 2013
by Michael Peter Christen
activated the clickdepth_i attribute for solr again because the
calculation of that value is not as extensive as expected and
furthermore the value is very useful for ranking
Changed Files: defaults/solr.keys.list
Sat Jan 05 00:58:27 CET 2013
by Michael Peter Christen
added also a re-calculation of reference counts during the
post-processing of clickcount calculations. This is a really nice thing
to have because the reference count affects ranking.
Changed Files: source/net/yacy/search/Switchboard.java
Sat Jan 05 00:37:52 CET 2013
by Michael Peter Christen
added 'Last Hour' to network statistics
Changed Files: htroot/Network.html, htroot/Network.java
Fri Jan 04 16:37:39 CET 2013
by Michael Peter Christen
added the clickdepth post-processing: some links may have 'shortcuts' to
already calculated click depths. These are then calculated when the crawl
buffer is empty and therefore no new 'shortcuts' can be discovered.
The status of the clickdepth stack (to-be-processed) can be seen using a
solr search command like this:
http://localhost:8090/solr/select?q=process_sxt:[*%20TO%20*]&start=0&rows=30&fl=sku,clickdepth_i,process_sxt
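
A small Java sketch for assembling an equivalent query URL with percent-encoded parameters follows. The class and method names are hypothetical; the host, port, path and field names are taken from the example URL above. Note that this sketch encodes more characters (such as ':' and ',') than the hand-written URL does, which yields an equivalent request.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for building the status query URL; not YaCy code.
class ClickdepthStatusUrlSketch {

    // Percent-encodes a parameter value; spaces become %20 instead of '+'.
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Builds a select URL listing documents that still have a process_sxt entry.
    static String statusUrl(String host, int port, int start, int rows) {
        return "http://" + host + ":" + port + "/solr/select"
                + "?q=" + encode("process_sxt:[* TO *]")
                + "&start=" + start
                + "&rows=" + rows
                + "&fl=" + encode("sku,clickdepth_i,process_sxt");
    }

    public static void main(String[] args) {
        System.out.println(statusUrl("localhost", 8090, 0, 30));
    }
}
```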
Changed Files: htroot/HostBrowser.java, source/net/yacy/search/Switchboard.java, source/net/yacy/search/index/SolrConfiguration.java
Thu Jan 03 19:21:21 CET 2013
by Michael Peter Christen
enhanced root-url detection
Changed Files: htroot/IndexControlRWIs_p.java, source/net/yacy/kelondro/data/meta/DigestURI.java, source/net/yacy/kelondro/index/RowHandleMap.java, source/net/yacy/search/index/SolrConfiguration.java, source/net/yacy/search/ranking/ReferenceOrder.java
Thu Jan 03 01:30:05 CET 2013
by Michael Peter Christen
clickpath should not be active by default because it needs extensive
computation - partly to be implemented
Changed Files: defaults/solr.keys.list
Thu Jan 03 01:27:16 CET 2013
by Michael Peter Christen
moved HTCache, Heuristics and Parser servlet to a more appropriate menu
location
Changed Files: htroot/ConfigHTCache_p.html, htroot/ConfigHeuristics_p.html, htroot/ConfigParser.html, htroot/ViewFile.html, htroot/env/templates/header.template, htroot/env/templates/submenuConfig.template, htroot/env/templates/submenuIndexControl.template, htroot/env/templates/submenuSearchConfiguration.template
Wed Jan 02 19:05:48 CET 2013
by Michael Peter Christen
removed warnings
Changed Files: source/net/yacy/cora/federate/opensearch/OpenSearchConnector.java
Wed Jan 02 15:08:07 CET 2013
by Michael Peter Christen
- Merge commit '168b1d130d9d67b5e8855a0b50c4ba7ad4a416f8'
- fixed conflict in htroot/yacysearch.java
- removed nedres check because it caused the remote server not to be
called at all in most cases (the local index already has results but we
want more)
- fixed a regex bug (one '=' too many)
Changed Files: defaults/heuristicopensearch.conf, defaults/yacy.init, htroot/ConfigHeuristics_p.html, htroot/ConfigHeuristics_p.java, htroot/yacysearch.java, source/net/yacy/cora/federate/opensearch/OpenSearchConnector.java, source/net/yacy/document/parser/xml/opensearchdescriptionReader.java
Sun Dec 30 02:13:48 CET 2012
by reger
fix: no results when the citation reference index is switched off in the configuration
- urlcitationindex != null check added to ResultEntry.referencesCount
- plus other places where conflicting procedure was used (and urlcitationindex was not already checked for != null)
Changed Files: htroot/api/webstructure.java, htroot/api/yacydoc.java, source/net/yacy/search/snippet/ResultEntry.java
Sat Dec 29 17:47:34 CET 2012
by orbiter
added a filterscannerfail attribute to QueryParams which controls whether
the fail/success status from the network scanner is applied to or
suppressed for search results. This is a feature that comes with the
port scanner.
Changed Files: htroot/yacy/search.java, htroot/yacysearch.java, source/net/yacy/search/query/QueryParams.java, source/net/yacy/search/query/SearchEvent.java
Sat Dec 29 04:53:20 CET 2012
by reger
fix: broken link on Crawler_p.html - issue 218
http://bugs.yacy.net/view.php?id=218
- reduced Solr logging (/select)
Changed Files: defaults/yacy.logging, htroot/Crawler_p.html
Thu Dec 27 13:56:13 CET 2012
by Michael Peter Christen
added missing extension 'mkv' for navigation
Changed Files: source/net/yacy/cora/document/analysis/Classification.java, source/net/yacy/search/query/SearchEvent.java
Thu Dec 27 10:01:10 CET 2012
by reger
Add config option to show HostBrowser link in search result
- ConfigPortal: added checkbox Host Browser
- yacy.init: added search.result.show.hostbrowser as default = on (true)
- fix HostBrowser: broken link to protected WebStructurePicture for public user
Changed Files: defaults/yacy.init, htroot/ConfigPortal.html, htroot/ConfigPortal.java, htroot/HostBrowser.html, htroot/HostBrowser.java, htroot/yacysearchitem.html, htroot/yacysearchitem.java