feat: implement serialization for InMemoryDocumentStore  #7887

@davidberenstein1957

Is your feature request related to a problem? Please describe.
InMemoryDocumentStore is really nice for showcasing demos, and it would be relatively easy to add to_disk and from_disk methods to make that even easier. I wrote a simple custom version myself.

Describe the solution you'd like

# Copyright 2024-present, David Berenstein, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import json
from pathlib import Path
from typing import Any, Dict

from haystack import Document
from haystack.document_stores.in_memory import InMemoryDocumentStore

class Database(InMemoryDocumentStore):
    def to_disk(self, path: str) -> None:
        """Write the database and its data to disk as a JSON file."""
        data: Dict[str, Any] = self.to_dict()
        # Store the documents next to the store's own init parameters.
        data["documents"] = [doc.to_dict(flatten=False) for doc in self.storage.values()]
        with open(path, "w", encoding="utf-8") as f:
            json.dump(data, f)

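    @classmethod
    def from_disk(cls, path: str) -> "Database":
        """Load the database and its data from a JSON file on disk.

        Returns an empty database if the file is missing or cannot be loaded.
        """
        if not Path(path).exists():
            return cls()
        try:
            with open(path, "r", encoding="utf-8") as f:
                data = json.load(f)
            cls_object = cls.from_dict(data)
            # Document.from_dict rebuilds nested fields (e.g. blob) that
            # Document(**doc) would leave as plain dicts.
            cls_object.write_documents([Document.from_dict(doc) for doc in data["documents"]])
            return cls_object
        except Exception:
            # Deliberately broad: fall back to an empty database on any load error.
            return cls()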
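
To make the intended round trip concrete, here is a minimal, dependency-free sketch of the same save/load contract. `TinyStore` and `round_trip_demo` are hypothetical stand-ins invented for illustration, not part of Haystack, and documents are plain dicts rather than `Document` objects. The sketch only shows the behavior one would expect from `to_disk` and `from_disk`: saved documents come back, and a missing or unreadable file yields an empty store.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List


class TinyStore:
    """Hypothetical stand-in for a document store (not a Haystack class)."""

    def __init__(self) -> None:
        # Documents keyed by their "id" field.
        self.storage: Dict[str, Dict[str, Any]] = {}

    def write_documents(self, docs: List[Dict[str, Any]]) -> None:
        for doc in docs:
            self.storage[doc["id"]] = doc

    def to_disk(self, path: str) -> None:
        """Write all documents to a JSON file."""
        with open(path, "w", encoding="utf-8") as f:
            json.dump({"documents": list(self.storage.values())}, f)

    @classmethod
    def from_disk(cls, path: str) -> "TinyStore":
        """Load documents from a JSON file, or return an empty store."""
        if not Path(path).exists():
            return cls()
        try:
            with open(path, "r", encoding="utf-8") as f:
                data = json.load(f)
            store = cls()
            store.write_documents(data["documents"])
            return store
        except (json.JSONDecodeError, KeyError):
            # Unreadable or malformed file: fall back to an empty store.
            return cls()


def round_trip_demo() -> Dict[str, int]:
    """Save two documents, reload them, and also load a missing file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = str(Path(tmp) / "store.json")
        store = TinyStore()
        store.write_documents(
            [{"id": "1", "content": "hello"}, {"id": "2", "content": "world"}]
        )
        store.to_disk(path)
        restored = TinyStore.from_disk(path)
        missing = TinyStore.from_disk(str(Path(tmp) / "absent.json"))
        return {"restored": len(restored.storage), "missing": len(missing.storage)}
```

Unlike the snippet above, this sketch catches only JSON and key errors; which exceptions a real implementation should swallow is a design choice left open here.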

Describe alternatives you've considered
N.A.

Additional context
N.A.
