admin管理员组文章数量:1123951
I’m encountering an issue where the number of documents in my Elasticsearch index exceeds the number of rows returned by my MySQL query. Specifically:
- MySQL Query Result: 4802 rows.
- Elasticsearch Index Document Count: 5515 documents.
This discrepancy persists even though the query results in MySQL are consistent. I suspect something is happening in my Logstash pipeline or Elasticsearch indexing process.
Logstash Configuration:
Here is my Logstash pipeline:
Input Configuration:
jdbc {
jdbc_connection_string => "jdbc:mysql://ipadresss:3306/SUPER_TEST_TRACK"
jdbc_user => "admin"
jdbc_password => "something"
jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
jdbc_driver_library => "/usr/share/java/mysql-connector-java-9.1.0.jar"
jdbc_paging_enabled => true
jdbc_page_size => 9000
statement => "SELECT
s.site AS siteName,
DATE_FORMAT(p.date_production, '%Y-%m-%d') AS productionDate,
SUM(p.initial_quantity * pcv.criterion_value) AS ptotalMeterage
FROM
palet p
JOIN palettes_criteria pc ON p.palet_id = pc.palette_id
JOIN criterion_value pcv ON pc.criterion_value_id = pcv.criterion_value_id
JOIN criterion c ON pcv.criterion_id = c.id_criterion
JOIN site s ON p.id_site = s.id_site
WHERE
c.criterion_name = 'unp'
GROUP BY
s.site, productionDate
ORDER BY
productionDate"
schedule => "* * * * *"
}
Filter Configuration:
mutate {
rename => {
"siteName" => "site_name"
"productionDate" => "production_date"
"ptotalMeterage" => "p_total_meterage"
}
remove_field => ["@timestamp", "@version"]
}
Output Configuration:
elasticsearch {
hosts => ["http://ipaddress:9200"]
index => "produced-palette-meterage-per-site-and-date"
document_id => "palet-%{sitename}-%{productiondate}-%{ptotalmeterage}"
}
Observations:
- Consistent MySQL Query Results:
- Running the SQL query directly in MySQL consistently returns 4802 rows.
- Example row from MySQL:
| siteName | productionDate | ptotalMeterage |
| BERRECHID | 2023-12-17 | 14364.279999999972 |
- Elasticsearch Index:
- Document count (
docs.count
): 5515 - Example document in Elasticsearch:
{
"_index": "produced-palette-meterage-per-site-and-date",
"_type": "_doc",
"_id": "palet-BERRECHID-2023-12-17-14364.279999999972",
"_source": {
"sitename": "BERRECHID",
"productiondate": "2023-12-17",
"ptotalmeterage": 14364.279999999972
}
}
where is the issue and how to solve it?
本文标签: logstashElasticsearch Index Contains More Documents Than MySQL Query ResultsStack Overflow
版权声明:本文标题:logstash - Elasticsearch Index Contains More Documents Than MySQL Query Results - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1736610155a1945423.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
发表评论