I am working with SQL Server and Apache Solr, and I am facing an issue where matters with no class code are still appearing when I apply a filter for EXISTS.
- Problem:
In SQL Server, we store trademarkclass_code and trademarkclass_description as multi-value columns. To maintain ordering and prevent NULL values, a zero-width space (\u200B or NCHAR(8203)) was inserted instead of NULL.
This results in Solr treating these fields as non-empty, even when they should be empty. When I filter for records where trademarkclass_code EXISTS, it includes matters that only contain \u200B, making the filter ineffective.
- SQL Server Example:
Data is stored like this:
    2034\u200B ABCHELLO\u200B\u200B
Which is displayed as:
    2034 ABCHELLO
- \u200B is invisible but still stored as data.
- When sent to Solr, it is indexed as non-empty, causing incorrect search results.
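To illustrate why the placeholder is not treated as empty, here is a minimal Java sketch. The class `ZeroWidthSpaceDemo` and the helper `isEffectivelyEmpty` are hypothetical names used only for illustration and are not part of our codebase. The sketch only shows that ordinary emptiness checks see a one-character string.

```java
final class ZeroWidthSpaceDemo {
    // Zero-width space, the placeholder written by SQL Server via NCHAR(8203).
    static final String ZWSP = "\u200B";

    /** Hypothetical helper: true if the value is null or holds only U+200B and ordinary whitespace. */
    static boolean isEffectivelyEmpty(String value) {
        if (value == null) {
            return true;
        }
        return value.replace(ZWSP, "").trim().isEmpty();
    }

    public static void main(String[] args) {
        String placeholder = "\u200B";
        System.out.println(placeholder.isEmpty());             // false: the string has length 1
        System.out.println(placeholder.trim().isEmpty());      // false: trim() only strips chars <= U+0020
        System.out.println(isEffectivelyEmpty(placeholder));   // true
        System.out.println(isEffectivelyEmpty("2034\u200B"));  // false: a real code is present
    }
}
```

Any check that should treat the placeholder as "no value" therefore has to strip U+200B explicitly.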
- Solr Query Filter Code:
Below is the relevant Java method that constructs the Solr filter query:
        if (CollectionUtils.isNotEmpty(matterFieldsFilters)) {
            String solrFieldName = FipSolrMatterField.fromName(fieldName).getSolrName().toLowerCase();
            for (SolrMatterFieldsFilter searchField : matterFieldsFilters) {
                searchField.setName(solrFieldName);
            }
            StringBuilder queryStringBuilder = new StringBuilder();
            for (SolrMatterFieldsFilter mFieldsFilter : matterFieldsFilters) {
                SolrMatterFieldsOperator operation = SolrMatterFieldsOperator.fromOperatorId(mFieldsFilter.getOperatorId());
                if (operation == SolrMatterFieldsOperator.EXISTS) {
                    queryStringBuilder.append("(" + mFieldsFilter.getName() + ":[* TO *])");
                } else if (operation == SolrMatterFieldsOperator.NOT_EXISTS) {
                    queryStringBuilder.append("(-" + mFieldsFilter.getName() + ":[* TO *])");
                } else if (StringUtils.isNotBlank(mFieldsFilter.getFieldValue())) {
                    String fieldValue = escapeQueryCharacters(mFieldsFilter.getFieldValue());
                    queryStringBuilder.append(buildSolrMatterFieldsQueryStr(mFieldsFilter.getName(), fieldValue, true, true, false));
                }
            }
            return queryStringBuilder.toString();
        }
        return null;
    }
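One direction I am considering for the EXISTS branch is a regular-expression clause that only matches indexed terms containing at least one character other than \u200B, instead of the open range `[* TO *]`. The sketch below is untested against our schema and rests on assumptions: the field is an untokenized string type, the standard query parser's `/regex/` syntax is available, and the class and method names are hypothetical.

```java
final class ExistsClauseSketch {
    // Zero-width space used as the "no value" placeholder in the source data.
    private static final String ZWSP = "\u200B";

    /**
     * Builds a clause matching documents where some term in the field
     * contains at least one character that is not U+200B.
     */
    static String existsIgnoringZwsp(String solrFieldName) {
        return "(" + solrFieldName + ":/.*[^" + ZWSP + "].*/)";
    }

    /**
     * Builds the negated clause. The leading *:* gives the nested
     * negative clause a document set to subtract from.
     */
    static String notExistsIgnoringZwsp(String solrFieldName) {
        return "(*:* -" + solrFieldName + ":/.*[^" + ZWSP + "].*/)";
    }

    public static void main(String[] args) {
        System.out.println(existsIgnoringZwsp("trademarkclass_code"));
        System.out.println(notExistsIgnoringZwsp("trademarkclass_code"));
    }
}
```

Regular-expression queries have to walk the field's terms, so this may be noticeably slower than a range query on a large index. I have not measured it.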
- Attempted Fixes:
- Removing \u200B from SQL Server Before Indexing
    UPDATE TrademarkClass SET trademarkclass_code = REPLACE(trademarkclass_code, NCHAR(8203), '')
✅ This works but requires modifying existing data.
- Excluding \u200B in Java Code Before Passing to Solr
I modified this part of the Java code:
    } else if (StringUtils.isNotBlank(mFieldsFilter.getFieldValue())
            && !mFieldsFilter.getFieldValue().contains("\u200B")) {
✅ This skips value filters that contain \u200B, but it does not change the EXISTS branch and doesn't remove \u200B from the index.
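A third option, which leaves the SQL Server data untouched, would be to clean the values in the indexing code just before the Solr document is built. The following is a minimal sketch, assuming the indexer holds each multi-value column as a `List<String>`. `IndexValueSanitizer` and `valuesToIndex` are hypothetical names. To keep codes and descriptions aligned by position, it only omits the field when every value is a placeholder and otherwise returns the list unchanged.

```java
import java.util.Collections;
import java.util.List;

final class IndexValueSanitizer {
    // Zero-width space used as the "no value" placeholder in the source data.
    private static final String ZWSP = "\u200B";

    /** True if the value is null or holds only U+200B and ordinary whitespace. */
    static boolean isPlaceholder(String value) {
        return value == null || value.replace(ZWSP, "").trim().isEmpty();
    }

    /**
     * Returns an empty list when every value is a placeholder, so the field
     * can be left out of the Solr document and [* TO *] no longer matches it.
     * Otherwise returns the original list so positions stay aligned.
     */
    static List<String> valuesToIndex(List<String> rawValues) {
        if (rawValues == null) {
            return Collections.emptyList();
        }
        for (String raw : rawValues) {
            if (!isPlaceholder(raw)) {
                return rawValues; // at least one real value: keep everything as-is
            }
        }
        return Collections.emptyList();
    }
}
```

With this approach the existing EXISTS and NOT_EXISTS clauses would not need to change, but the affected documents would have to be reindexed.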
- Questions:
What is the best way to ensure Solr treats \u200B values as empty or NULL?
Is there a better way to handle filtering so that EXISTS works correctly?
Should we use a different placeholder (like '-' or '[EMPTY]') instead of \u200B?
Any suggestions or best practices would be greatly appreciated!