
OutOfMemoryError while streaming 20M+ rows in Spring Boot with Hibernate

Issue

I’m streaming 20M+ rows from my Content table using a Stream<Object[]> in Spring Boot with Hibernate. However, I encounter java.lang.OutOfMemoryError: Java heap space, even though I expect each row to be garbage-collected after processing.

Error Logs (Partial)

Mar 09 14:13:51 : dev-pedagogy-learner-progress Exception in thread "http-nio-8080-Poller" java.lang.OutOfMemoryError: Java heap space
Mar 09 14:13:51 : dev-pedagogy-learner-progress Exception in thread "Catalina-utility-1" java.lang.OutOfMemoryError: Java heap space
Mar 09 14:13:51 : dev-pedagogy-learner-progress 2025-03-09T08:43:51.685Z  INFO 1 --- [nio-8080-exec-2] o.s.core.annotation.MergedAnnotation     : Failed to introspect annotations on java.lang.Object org.springframework.boot.actuate.endpoint.web.servlet.AbstractWebMvcEndpointHandlerMapping$OperationHandler.handle(jakarta.servlet.http.HttpServletRequest,java.util.Map): java.lang.OutOfMemoryError: Java heap space
Mar 09 14:13:51 : dev-pedagogy-learner-progress 2025-03-09T08:43:51.767Z ERROR 1 --- [         task-2] o.h.engine.jdbc.spi.SqlExceptionHelper   : Java heap space
Mar 09 14:13:51 : dev-pedagogy-learner-progress 2025-03-09T08:43:51.767Z  WARN 1 --- [         task-2] o.h.engine.jdbc.spi.SqlExceptionHelper   : SQL Error: 0, SQLState: S1000

Code for Streaming Data

I am using a Hibernate Stream<Object[]> query with a fetch size of 1000. I expect each row to become eligible for garbage collection after processing, but memory consumption keeps increasing.

try (Stream<Object[]> stream = contentRepository.streamContentProgress(allChapterIds, alluserIds)) {
    stream.forEachOrdered(row -> {
        processRow(row, chapterToContentSet, contentToBatchProgressMap, programBatchToUserIdsMap);
        row = null; // Trying to free memory
    });
}

Code for Processing Each Row

private void processRow(Object[] row,
                        Map<Long, Set<Long>> chapterToContentSet,
                        Map<Long, Map<Integer, Long>> contentToBatchProgressMap,
                        Map<Integer, List<Long>> programBatchToUserIdsMap) {
    Long chapterId = ((Number) row[0]).longValue();
    Long contentId = ((Number) row[1]).longValue();
    Long userId = ((Number) row[2]).longValue();
    Long progress = ((Number) row[3]).longValue();

    chapterToContentSet.computeIfAbsent(chapterId, k -> new HashSet<>()).add(contentId);

    for (Map.Entry<Integer, List<Long>> entry : programBatchToUserIdsMap.entrySet()) {
        Integer programBatchId = entry.getKey();
        List<Long> batchUserIds = entry.getValue();

        if (batchUserIds.contains(userId)) {
            contentToBatchProgressMap
                    .computeIfAbsent(contentId, k -> new HashMap<>())
                    .merge(programBatchId, progress, Long::sum);
        }
    }

    // Attempting to free memory
    Arrays.fill(row, null);
    row = null;
    chapterId = null;
    contentId = null;
    userId = null;
    progress = null;
    // System.gc(); // (Not recommended, just an experiment)
}
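
For reference, here is a minimal, self-contained version of the same accumulation logic driven by synthetic rows (all ids and progress values below are made up for illustration). On small data it produces the aggregates I expect, so the per-row logic itself seems correct:

```java
import java.util.*;

public class ProcessRowDemo {
    // Same accumulation logic as processRow, without Hibernate.
    static void processRow(Object[] row,
                           Map<Long, Set<Long>> chapterToContentSet,
                           Map<Long, Map<Integer, Long>> contentToBatchProgressMap,
                           Map<Integer, List<Long>> programBatchToUserIdsMap) {
        Long chapterId = ((Number) row[0]).longValue();
        Long contentId = ((Number) row[1]).longValue();
        Long userId   = ((Number) row[2]).longValue();
        Long progress = ((Number) row[3]).longValue();

        chapterToContentSet.computeIfAbsent(chapterId, k -> new HashSet<>()).add(contentId);

        for (Map.Entry<Integer, List<Long>> entry : programBatchToUserIdsMap.entrySet()) {
            if (entry.getValue().contains(userId)) {
                contentToBatchProgressMap
                        .computeIfAbsent(contentId, k -> new HashMap<>())
                        .merge(entry.getKey(), progress, Long::sum);
            }
        }
    }

    public static void main(String[] args) {
        Map<Long, Set<Long>> chapterToContentSet = new HashMap<>();
        Map<Long, Map<Integer, Long>> contentToBatchProgressMap = new HashMap<>();
        // Hypothetical batch membership: batch 1 has users 10 and 11, batch 2 has user 11.
        Map<Integer, List<Long>> programBatchToUserIdsMap =
                Map.of(1, List.of(10L, 11L), 2, List.of(11L));

        // Synthetic rows: chapterId, contentId, userId, progress
        Object[][] rows = {
                {100L, 200L, 10L, 50L},
                {100L, 200L, 11L, 30L},
                {100L, 201L, 11L, 20L},
        };
        for (Object[] row : rows) {
            processRow(row, chapterToContentSet, contentToBatchProgressMap, programBatchToUserIdsMap);
        }

        System.out.println(chapterToContentSet);                 // chapter 100 -> contents {200, 201}
        System.out.println(contentToBatchProgressMap.get(200L)); // batch 1 -> 80, batch 2 -> 30
    }
}
```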

Repository Query (Streaming)

@QueryHints({
    @QueryHint(name = "org.hibernate.fetchSize", value = "1000"),
    @QueryHint(name = "org.hibernate.cacheMode", value = "IGNORE")
})
@Query("SELECT c.chapterId, c.contentId, c.userId, c.progress FROM Content c " +
       "WHERE c.chapterId IN :chapterIds AND c.userId IN :userIds")
Stream<Object[]> streamContentProgress(@Param("chapterIds") List<Long> chapterIds,
                                       @Param("userIds") List<Long> userIds);

Entity Class

@Setter
@Getter
@RequiredArgsConstructor
@Entity
@SuperBuilder(toBuilder = true)
public class Content extends BaseEntity {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @Column(name = "content_id", nullable = false)
    private Long contentId;

    @Column(name = "user_id", nullable = false)
    private Long userId;

    @Column(name = "progress", nullable = false)
    private Integer progress;

    @Column(name = "chapter_id", nullable = false)
    private Long chapterId;
}

What I Have Tried

Setting fetch size (1000) and disabling caching using @QueryHints → Still seeing memory growth

Explicitly setting row = null and calling Arrays.fill(row, null) → No improvement

Calling System.gc() (not recommended, but tried for debugging) → No effect

Checked Hibernate first-level caching (cacheMode = IGNORE) → No caching issue

Questions

Why is memory not being freed up even though I’m using a stream and not collecting data?

Does Stream<Object[]> still retain previous results in memory?

Should I explicitly detach entities or manually clear the Hibernate session?
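
If clearing the session is the right approach, what I have in mind is a wrapper that runs a flush action every N rows, roughly like this (flushAction is a placeholder; in the real code it would presumably be entityManager::clear, which I have not tried yet):

```java
import java.util.function.Consumer;
import java.util.stream.Stream;

public class BatchedConsumerDemo {
    // Wraps a per-row consumer so that flushAction runs once every batchSize rows.
    static <T> Consumer<T> withPeriodicFlush(Consumer<T> delegate, int batchSize, Runnable flushAction) {
        int[] count = {0}; // mutable counter captured by the lambda (stream is sequential)
        return item -> {
            delegate.accept(item);
            if (++count[0] % batchSize == 0) {
                flushAction.run(); // e.g. entityManager.clear() in the real app
            }
        };
    }

    public static void main(String[] args) {
        int[] flushes = {0};
        Consumer<Object[]> processor = row -> { /* processRow(...) would go here */ };

        // 2500 synthetic rows with a batch size of 1000 -> flush fires at rows 1000 and 2000.
        Stream.generate(() -> new Object[]{1L, 2L, 3L, 4L})
              .limit(2500)
              .forEachOrdered(withPeriodicFlush(processor, 1000, () -> flushes[0]++));

        System.out.println(flushes[0]); // 2
    }
}
```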

Is there a better way to process large datasets efficiently without hitting OutOfMemoryError?
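
One alternative I'm considering is keyset pagination: process the table in fixed-size chunks ordered by id, keeping only one chunk in memory at a time. Below is a sketch with a stubbed fetchChunk; in the real code it would be a query along the lines of WHERE c.id > :lastId ORDER BY c.id with setMaxResults(chunkSize):

```java
import java.util.ArrayList;
import java.util.List;

public class KeysetPaginationDemo {
    // Stub standing in for a repository call:
    //   SELECT ... WHERE id > :lastId ORDER BY id LIMIT :chunkSize
    static List<Long> fetchChunk(long lastId, int chunkSize, long totalRows) {
        List<Long> chunk = new ArrayList<>();
        for (long id = lastId + 1; id <= totalRows && chunk.size() < chunkSize; id++) {
            chunk.add(id);
        }
        return chunk;
    }

    public static void main(String[] args) {
        long totalRows = 10_500;   // synthetic table size
        int chunkSize = 1_000;
        long lastId = 0;
        long processed = 0;

        while (true) {
            List<Long> chunk = fetchChunk(lastId, chunkSize, totalRows);
            if (chunk.isEmpty()) break;
            processed += chunk.size();            // processRow(...) per element would go here
            lastId = chunk.get(chunk.size() - 1); // keyset cursor: last id of this chunk
        }

        System.out.println(processed); // 10500
    }
}
```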

Any help would be appreciated!
