I need to fetch a large amount of data (about 10 MB) from RTDB. All of it is stored under the stations node. When I fetch everything in a single request, my Android app sometimes crashes with an OutOfMemory error. So I need to read the data in pages, but I've run into an issue with that.

Reading the full data from stations takes approx. 10-12 seconds in my case. But when I make several paginated requests, reading about 2 MB per page, each request also takes approx. 10 seconds. As a result, the total time to get the stations data grew from 10 seconds (single read) to 50 seconds (paged read). Can I make pagination work faster? Thanks.

Structure:

Rules:

Pagination logic:

private fun fetchDayStationsPaged(dayTag: String, lastNodeId: String? = null, stations: MutableList<StationCloud> = mutableListOf(), callback: (data: List<StationCloud>, errorMessage: LoadError?) -> Unit){
    val path = String.format(TimelineManager.KEY_TIMELINE_STATIONS, dayTag)

    val query = if (lastNodeId == null)
        database
            .getReference(path)
            .orderByKey()
            .limitToFirst(2000) //1 node takes approx 1KB
    else
        database
            .getReference(path)
            .orderByKey()
            .startAfter(lastNodeId)
            .limitToFirst(2000)

    Timber.d("loader recursion stations $dayTag/${stations.size}")

    fetchDayStations(query) { data, errorMessage ->
        if (errorMessage == LoadError.NotExist)
            callback(stations, null)
        else if (errorMessage != null)
            callback(emptyList(), errorMessage)
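        else {
            stations.addAll(data)
            // Guard: data.last() throws on an empty page, so stop here instead
            if (data.isEmpty())
                callback(stations, null)
            else
                fetchDayStationsPaged(dayTag, data.last().nodeId, stations, callback)
        }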
    }
}

private var loaderDisposable: Disposable? = null

private fun fetchDayStations(ref: Query, callback: (data: List<StationCloud>, errorMessage: LoadError?) -> Unit){

    loaderDisposable?.dispose()
    loaderDisposable = FirebaseHelper
        .dbReadAsSingle(ref)
        .subscribeOn(AndroidSchedulers.mainThread())
        .observeOn(AndroidSchedulers.mainThread())
        .doOnDispose {
            callback(emptyList(), LoadError.Cancelled)
        }
        .subscribe ({ snapshot ->
            if (snapshot.exists()) {
                val stations = mutableListOf<StationCloud>()
                snapshot.children.forEach { item ->
                    item.getValue(StationCloud::class.java)?.also { station ->
                        stations.add(station.copy(nodeId = item.key))
                    }
                }
                Timber.d("loader fetchDayStations stations size = ${stations.size}")
                callback(stations.toList(), null)
            } else
                callback(emptyList(), LoadError.NotExist)
        }, {
            Timber.e(it)
            callback(emptyList(), LoadError.CantGet)
        })
}

private fun dbReadAsSingle(ref: Query): Single<DataSnapshot> {
    return Single.create { emitter ->
        ref.get().addOnCompleteListener { task ->
            Timber.d("runQueryCloudFirst task succeed = ${task.isSuccessful}")
            if (task.isSuccessful && emitter.isDisposed.not()){
                Timber.d("runQueryCloudFirst children size = ${task.result.childrenCount}")
                emitter.onSuccess(task.result)
            } else
                task.exception?.also {
                    //todo check a bug: timeout exception doesn't work when offline
                    //https://github.com/firebase/firebase-android-sdk/issues/5771
                    Timber.e(it, "runQueryCloudFirst")
                    // tryOnError avoids an UndeliverableException if the emitter was disposed meanwhile
                    emitter.tryOnError(it)
                }
        }
    }
}

Asked Dec 30, 2024 at 11:56 by Konstantin Konopko

1 Answer

If I understand correctly, you are loading the data from Firebase's local cache. In that case the behavior makes sense: Firebase has to read the full stations data from the local cache before it can return a slice of that data. So it loads the full data for each slice, which is why the total time grows linearly with the number of pages.
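To make the arithmetic concrete, here is a rough cost model in Kotlin. It is only a sketch under the assumption stated above, namely that every page request scans the whole node before returning its slice. The function name `nodesScanned` is made up for illustration, and the sizes come from the question (about 10 MB at roughly 1 KB per node, 2000 nodes per page).

```kotlin
// Rough cost model, not a measurement. Assumption: when fullScanPerPage is
// true, every page request scans the whole node before returning its slice.
fun nodesScanned(totalNodes: Int, pageSize: Int, fullScanPerPage: Boolean): Long {
    require(totalNodes >= 0) { "totalNodes must not be negative" }
    require(pageSize > 0) { "pageSize must be positive" }
    // Ceiling division: number of pages needed to cover all nodes
    val pages = (totalNodes + pageSize - 1) / pageSize
    return if (fullScanPerPage) pages.toLong() * totalNodes else totalNodes.toLong()
}
```

With 10,000 nodes and a page size of 2,000 there are 5 pages, so under this model the paged read touches 5 times as many nodes as the single read. That matches the jump from roughly 10 to roughly 50 seconds reported in the question.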

There is nothing within the Firebase Realtime Database API you can change about this. The best option typically is to change your data model to fit the use-case.

A good practice when dealing with NoSQL databases is to model your data for what you display on the screen, and to only load data that you actually display to the users.


In a mobile scenario it seems quite unlikely that you're displaying 5-7MB of data on a single screen. Even 10% of that would already be significantly more than average. More likely, you're loading a lot of data that you never directly display. That's where your solution lies:

  • If you're only displaying a subset of each child node, consider creating a separate top-level node that contains only the information you need for each child node. For example, station_list_info could contain just the few properties you actually display. That node will likely be much smaller, and so lead to fewer memory problems.
  • If you're displaying aggregate values based on the child nodes, consider storing those aggregate values in the database rather than calculating them on each client. If you store an aggregate value like this, you'll have to update it whenever you create, update, or delete a station, so writing the data becomes more involved. On the other hand, reading the aggregate value then becomes a trivial operation.
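As a minimal sketch of the first option, the Kotlin below builds a single multi-path write that stores the full station and its lightweight summary together. Everything here is assumed for illustration: the `Station` fields, the helper names, and the `stations/...` and `station_list_info/...` paths (the question does not show the real `StationCloud` model or path format). The resulting map has the shape that `DatabaseReference.updateChildren()` accepts when called on the database root.

```kotlin
// Hypothetical model: the real StationCloud fields are not shown in the question.
data class Station(
    val nodeId: String,
    val name: String,
    val lat: Double,
    val lng: Double,
    val details: Map<String, Any?> = emptyMap() // bulky payload the list screen never shows
)

// Only the properties the list screen displays (assumed to be name and position).
fun Station.toListInfo(): Map<String, Any?> =
    mapOf("name" to name, "lat" to lat, "lng" to lng)

// Full representation: the summary fields plus the bulky payload.
fun Station.toFullMap(): Map<String, Any?> =
    toListInfo() + mapOf("details" to details)

// Builds one multi-path update so the full node and its summary are written together.
// The paths are illustrative, not the ones used in the question.
fun buildStationWrite(dayTag: String, station: Station): Map<String, Any?> = mapOf(
    "stations/$dayTag/${station.nodeId}" to station.toFullMap(),
    "station_list_info/$dayTag/${station.nodeId}" to station.toListInfo()
)
```

The list screen would then read only `station_list_info`, and load a full station on demand when the user opens it.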

Both of these are common approaches when using a NoSQL database, where you typically end up with a data model that reflects your use-cases. To learn more about this, I recommend reading "NoSQL data modeling techniques".
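For the stored-aggregate approach, the bookkeeping on each write could look roughly like the Kotlin sketch below. The `StationStats` type and the `durationSec` value are hypothetical, chosen only to show how create, update, and delete each adjust the stored aggregate. With several clients writing at once, this update would likely need to run inside a transaction (for example via `DatabaseReference.runTransaction`).

```kotlin
// Hypothetical aggregate kept in the database next to the stations.
data class StationStats(val count: Long = 0, val totalDurationSec: Long = 0) {
    // Derived on read; no need to load the individual stations.
    val averageDurationSec: Double
        get() = if (count == 0L) 0.0 else totalDurationSec.toDouble() / count
}

// Apply these whenever a station is created, updated, or deleted.
fun StationStats.onCreated(durationSec: Long): StationStats =
    copy(count = count + 1, totalDurationSec = totalDurationSec + durationSec)

fun StationStats.onUpdated(oldDurationSec: Long, newDurationSec: Long): StationStats =
    copy(totalDurationSec = totalDurationSec - oldDurationSec + newDurationSec)

fun StationStats.onDeleted(durationSec: Long): StationStats =
    copy(count = count - 1, totalDurationSec = totalDurationSec - durationSec)
```

Writes become a little more involved, but a client that only needs the average reads one small node instead of every station.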

Tags: android - Firebase RTDB Read data using pagination is too slow - Stack Overflow