Storing Data with Contiguous IDs in Redis


August 11, 2014, 02:14 AM, 2339 views

Category: Database; Tags: Python, Redis

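While designing 「Doodle 2」 and developing 「知乎日报」 (Zhihu Daily), the kind of data I dealt with most was data that carries an ID.
In a relational database an auto-increment primary key covers this need; in Redis it takes a bit more work.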

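Until now I have always used a separate counter to hand out IDs, for example (a lot of code is omitted, which should not affect readability):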

class IDModel(JSONModel):
    id = IntegerProperty()

    @classmethod
    def get_by_id(cls, entity_id):
        json_content = cls.redis_client.hget(cls._KEY, entity_id)
        if json_content:
            return cls.from_json(json_content)

    @classmethod
    def get_by_ids(cls, ids):
        if not ids:
            return []

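        # HMGET returns None for IDs that do not exist; keep those as None
        # instead of passing them to from_json().
        results = cls.redis_client.hmget(cls._KEY, ids)

        return [cls.from_json(json_content) if json_content else None
                for json_content in results]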

    @classmethod
    def get_next_id(cls):
        return MAX_ID.get_next_id(cls._KEY)

    def save(self):
        self._populate_default_attributes()
        self._save_self()

    def _populate_default_attributes(self):
        if self.id is None:
            self.id = self.get_next_id()

    def _save_self(self):
        self.redis_client.hset(self._KEY, self.id, self.to_json())


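class MAX_ID(SimpleModel):
    """Per-type ID counters, all kept in a single hash."""

    @classmethod
    def get_next_id(cls, for_type, increment=1):
        # HINCRBY is atomic, so concurrent writers never receive the same ID.
        return cls.redis_client.hincrby(cls._KEY, for_type, increment)

    @classmethod
    def get_max_id(cls, for_type):
        # A missing field means no ID has been allocated for this type yet.
        return int(cls.redis_client.hget(cls._KEY, for_type) or 0)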

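The advantage of this approach is that it is simple, and it suits data whose IDs are not contiguous; the drawback is that it stores everything in a hash, which takes more memory.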
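To make the counter-plus-hash scheme concrete without the omitted model classes, here is a minimal sketch. `InMemoryHashes` is a hypothetical stand-in for a Redis client (it only mimics HINCRBY, HSET, HGET and HDEL), and the key names `"MAX_ID"` and `"Article"` are assumptions, not the original `_KEY` values. It shows why this scheme tolerates gaps: deleting an entity never causes its ID to be handed out again.

```python
import json


class InMemoryHashes:
    """Hypothetical stand-in for a Redis client; mimics a few hash commands."""

    def __init__(self):
        self._hashes = {}

    def hincrby(self, key, field, amount=1):
        bucket = self._hashes.setdefault(key, {})
        bucket[field] = int(bucket.get(field, 0)) + amount
        return bucket[field]

    def hset(self, key, field, value):
        self._hashes.setdefault(key, {})[field] = value

    def hget(self, key, field):
        return self._hashes.get(key, {}).get(field)

    def hdel(self, key, field):
        removed = self._hashes.get(key, {}).pop(field, None)
        return 1 if removed is not None else 0


def save_entity(client, kind, data):
    """Allocate the next ID from the counter hash, then store the entity as JSON."""
    entity_id = client.hincrby("MAX_ID", kind, 1)  # "MAX_ID" is an assumed key name
    client.hset(kind, entity_id, json.dumps(dict(data, id=entity_id)))
    return entity_id


def load_entity(client, kind, entity_id):
    """Return the decoded entity, or None if the ID does not exist."""
    raw = client.hget(kind, entity_id)
    return json.loads(raw) if raw else None


client = InMemoryHashes()
first = save_entity(client, "Article", {"title": "a"})
second = save_entity(client, "Article", {"title": "b"})
client.hdel("Article", first)  # leaves a gap in the ID sequence
third = save_entity(client, "Article", {"title": "c"})
```

Because the counter lives apart from the data, the third entity gets a fresh ID even though the first one is gone. A list-based layout cannot do this, since positions shift or get reused.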

I just thought of another approach: store the data directly in a list and derive the ID from the list's length, for example:

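class IDModel(JSONModel):
    id = IntegerProperty()

    @classmethod
    def get_by_id(cls, entity_id):
        # IDs start at 1 while list indexes start at 0. IDs below 1 are
        # rejected because LINDEX counts negative indexes from the tail.
        if entity_id < 1:
            return None
        json_content = cls.redis_client.lindex(cls._KEY, entity_id - 1)
        if json_content:
            return cls.from_json(json_content)

    @classmethod
    def get_by_ids(cls, ids):
        if not ids:
            return []

        key = cls._KEY
        # One LINDEX per ID (callers pass IDs >= 1), sent in a single round trip.
        pipe = cls.redis_client.pipeline(transaction=False)
        for entity_id in ids:
            pipe.lindex(key, entity_id - 1)
        results = pipe.execute()

        # LINDEX returns None for an index past the end of the list.
        return [cls.from_json(json_content) if json_content else None
                for json_content in results]

    def save(self):
        key = self._KEY

        if self.id is None:
            with self.redis_client.pipeline() as pipe:
                try:
                    # WATCH makes EXEC fail (redis-py raises WatchError) if
                    # another client changes the list before the push.
                    pipe.watch(key)
                    self.id = pipe.llen(key) + 1
                    pipe.multi()
                    pipe.rpush(key, self.to_json())
                    pipe.execute()
                except Exception:
                    self.id = None
                    raise
        else:
            # The entity with ID n lives at index n - 1.
            self.redis_client.lset(key, self.id - 1, self.to_json())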
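One detail that is easy to get wrong here is the mapping between IDs and list positions. `save()` assigns `LLEN + 1` before pushing, so the entity with ID n ends up at index n - 1, and since LINDEX interprets negative indexes as offsets from the tail, an ID of 0 must not be turned into index -1. The helpers below are a small sketch of that mapping; their names are hypothetical, and a plain Python list stands in for the Redis list.

```python
def id_to_index(entity_id):
    """Map a 1-based ID to a 0-based list index, rejecting IDs below 1."""
    if entity_id < 1:
        raise ValueError("IDs start at 1, got %r" % (entity_id,))
    return entity_id - 1


def index_to_id(index):
    """Map a 0-based list index back to the 1-based ID."""
    return index + 1


# A plain Python list stands in for the Redis list here.
stored = ["first", "second", "third"]
new_id = len(stored) + 1   # same rule as LLEN + 1 in save()
stored.append("fourth")    # plays the role of RPUSH
```

With this mapping, `stored[id_to_index(new_id)]` is the element that was just appended.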

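There are roughly three drawbacks:

1. IDs must be contiguous
2. It relies on a transaction, which makes the logic of _populate_default_attributes() hard to separate out
3. Fetching the entities for several IDs requires several commands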
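Two of these drawbacks can be softened under some assumptions. This is a sketch of an alternative, not the approach used above. If the stored JSON does not need to embed its own ID, the reply of RPUSH (the list length after the push) can serve as the ID, so no WATCH is needed; and when the wanted IDs form a contiguous range, a single LRANGE fetches them all. `InMemoryList` is a hypothetical stand-in for a Redis client, and the function names are made up for illustration.

```python
import json


class InMemoryList:
    """Hypothetical stand-in for a Redis client; mimics RPUSH and LRANGE."""

    def __init__(self):
        self._lists = {}

    def rpush(self, key, *values):
        items = self._lists.setdefault(key, [])
        items.extend(values)
        return len(items)  # RPUSH replies with the new list length

    def lrange(self, key, start, stop):
        items = self._lists.get(key, [])
        # LRANGE's stop is inclusive; this stand-in handles
        # non-negative indexes only.
        return items[start:stop + 1]


def append_entity(client, key, data):
    """Push the payload without an ID and use the new list length as the ID."""
    return client.rpush(key, json.dumps(data))


def get_by_id_range(client, key, first_id, last_id):
    """Fetch entities with IDs first_id..last_id (inclusive) in one LRANGE."""
    raw_items = client.lrange(key, first_id - 1, last_id - 1)
    entities = []
    for offset, raw in enumerate(raw_items):
        entity = json.loads(raw)
        entity["id"] = first_id + offset  # the ID is derived from the position
        entities.append(entity)
    return entities


client = InMemoryList()
ids = [append_entity(client, "Article", {"title": t}) for t in ("a", "b", "c")]
page = get_by_id_range(client, "Article", 2, 3)
```

The trade-off is that the ID only exists implicitly as a position, and an arbitrary, non-contiguous set of IDs still needs one LINDEX per ID in a pipeline.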

Still, next to the memory savings, none of the rest really matters...
