Yusuf Khalidd

Posted on Mar 19 • Originally published at apidog.com

Generating 100+ Agent Configurations with Large Language Models (LLMs) and Batch Processing

Introduction

Configuring hundreds of AI agents for a social media simulation can look like a daunting task. Each agent needs activity schedules, a posting frequency, response delays, influence weights, and stances. Doing this by hand would take hours.

MiroFish automates this with LLM-powered configuration generation. The system analyzes your documents, knowledge graph, and simulation requirements, then generates a detailed configuration for each agent.

The practical challenges include:

LLMs sometimes fail, or return truncated output or invalid JSON
Token limits cap how much data can be passed in a single call
Outputs need automatic or manual repair

This guide walks through the implementation step by step:

Sequential generation (time → events → agents → platforms)
Batching the data to stay within context limits
Automatic JSON repair for truncated output
A rule-based fallback when the LLM fails
Agent activity patterns by type (student, official, media)
Automated validation and correction logic

💡 Practical tip: this pipeline handles 100+ agents through sequential API calls. Use Apidog to validate the request/response schemas at each stage, catch JSON errors before production, and generate test scenarios for edge cases such as incomplete LLM output.

All code is taken from actual use in MiroFish.

Architecture Overview

The configuration generator is built on a sequential pipeline:

┌───────────────────┐     ┌───────────────────┐     ┌───────────────────┐
│ Context Builder   │ ──► │ Time Config       │ ──► │ Event Config      │
│                   │     │ Generator         │     │ Generator         │
│                   │     │                   │     │                   │
│ - Simulation      │     │ - Total hours     │     │ - Initial posts   │
│   requirements    │     │ - Minutes/round   │     │ - Hot topics      │
│ - Entity summary  │     │ - Peak hours      │     │ - Narrative       │
│ - Document text   │     │ - Activity factor │     │   direction       │
└───────────────────┘     └───────────────────┘     └───────────────────┘
                                                              │
                                                              ▼
┌───────────────────┐     ┌───────────────────┐     ┌───────────────────┐
│ Final Config      │ ◄── │ Platform Config   │ ◄── │ Agent Config      │
│ Assembly          │     │                   │     │ Batches           │
│                   │     │                   │     │                   │
│ - Merge all       │     │ - Twitter params  │     │ - 15 agents       │
│ - Validate        │     │ - Reddit params   │     │   per batch       │
│ - Save JSON       │     │ - Virality        │     │ - N batches       │
│                   │     │   threshold       │     │                   │
└───────────────────┘     └───────────────────┘     └───────────────────┘

File Structure

backend/app/services/
├── simulation_config_generator.py  # Main generation logic
├── ontology_generator.py           # Ontology generation
└── zep_entity_reader.py            # Entity filtering

backend/app/models/
├── task.py                         # Task tracking
└── project.py                      # Project state

Step-by-Step Generation Strategy

Generating every configuration in a single call would exceed token limits, so the generator works in stages. Each agent batch covers 15 agents.

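import logging
import math
from typing import Callable, List, Optional

logger = logging.getLogger(__name__)


class SimulationConfigGenerator: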
    AGENTS_PER_BATCH = 15
    MAX_CONTEXT_LENGTH = 50000
    TIME_CONFIG_CONTEXT_LENGTH = 10000
    EVENT_CONFIG_CONTEXT_LENGTH = 8000
    ENTITY_SUMMARY_LENGTH = 300
    AGENT_SUMMARY_LENGTH = 300
    ENTITIES_PER_TYPE_DISPLAY = 20

    def generate_config(
        self,
        simulation_id: str,
        project_id: str,
        graph_id: str,
        simulation_requirement: str,
        document_text: str,
        entities: List[EntityNode],
        enable_twitter: bool = True,
        enable_reddit: bool = True,
        progress_callback: Optional[Callable[[int, int, str], None]] = None,
    ) -> SimulationParameters:

        num_batches = math.ceil(len(entities) / self.AGENTS_PER_BATCH)
        total_steps = 3 + num_batches
        current_step = 0

        def report_progress(step: int, message: str):
            nonlocal current_step
            current_step = step
            if progress_callback:
                progress_callback(step, total_steps, message)
            logger.info(f"[{step}/{total_steps}] {message}")

        context = self._build_context(
            simulation_requirement=simulation_requirement,
            document_text=document_text,
            entities=entities
        )

        reasoning_parts = []

        # Step 1: time config
        report_progress(1, "Generating time configuration...")
        time_config_result = self._generate_time_config(context, len(entities))
        time_config = self._parse_time_config(time_config_result, len(entities))
        reasoning_parts.append(f"Time config: {time_config_result.get('reasoning', 'Success')}")

        # Step 2: event config
        report_progress(2, "Generating event config and hot topics...")
        event_config_result = self._generate_event_config(context, simulation_requirement, entities)
        event_config = self._parse_event_config(event_config_result)
        reasoning_parts.append(f"Event config: {event_config_result.get('reasoning', 'Success')}")

        # Steps 3-N: agent configs in batches
        all_agent_configs = []
        for batch_idx in range(num_batches):
            start_idx = batch_idx * self.AGENTS_PER_BATCH
            end_idx = min(start_idx + self.AGENTS_PER_BATCH, len(entities))
            batch_entities = entities[start_idx:end_idx]

            report_progress(
                3 + batch_idx,
                f"Generating agent config ({start_idx + 1}-{end_idx}/{len(entities)})..."
            )

            batch_configs = self._generate_agent_configs_batch(
                context=context,
                entities=batch_entities,
                start_idx=start_idx,
                simulation_requirement=simulation_requirement
            )
            all_agent_configs.extend(batch_configs)

        reasoning_parts.append(f"Agent config: Generated {len(all_agent_configs)} agents")

        # Assign initial post publishers
        event_config = self._assign_initial_post_agents(event_config, all_agent_configs)

        # Final: platform config
        report_progress(total_steps, "Generating platform configuration...")
        twitter_config = PlatformConfig(platform="twitter", ...) if enable_twitter else None
        reddit_config = PlatformConfig(platform="reddit", ...) if enable_reddit else None

        params = SimulationParameters(
            simulation_id=simulation_id,
            project_id=project_id,
            graph_id=graph_id,
            simulation_requirement=simulation_requirement,
            time_config=time_config,
            agent_configs=all_agent_configs,
            event_config=event_config,
            twitter_config=twitter_config,
            reddit_config=reddit_config,
            generation_reasoning=" | ".join(reasoning_parts)
        )

        return params

Advantages of this approach:

Each LLM call stays focused and easy to control.
The user gets real progress updates.
A failure in one stage does not bring down the rest of the process, since each stage has a fallback.
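
The batching arithmetic used by the generator above can be sketched in isolation. This is a minimal illustration rather than MiroFish code: `plan_batches` is a hypothetical helper, and the only values taken from the article are the batch size of 15 and the `3 + num_batches` step count.

```python
import math
from typing import List, Tuple

AGENTS_PER_BATCH = 15  # batch size used in the article


def plan_batches(num_entities: int, batch_size: int = AGENTS_PER_BATCH) -> List[Tuple[int, int]]:
    """Return (start_idx, end_idx) slices, end exclusive, that cover all entities."""
    num_batches = math.ceil(num_entities / batch_size)
    return [
        (i * batch_size, min((i + 1) * batch_size, num_entities))
        for i in range(num_batches)
    ]


batches = plan_batches(100)
# One step each for time, event, and platform config, plus one step per batch
total_steps = 3 + len(batches)
print(len(batches), batches[-1], total_steps)
```

For 100 entities this yields 7 batches, the last one holding agents 90 to 99, and 10 progress steps in total.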

Building the Context

Gather the relevant information while staying within a safe context budget:

def _build_context(
    self,
    simulation_requirement: str,
    document_text: str,
    entities: List[EntityNode]
) -> str:

    entity_summary = self._summarize_entities(entities)

    context_parts = [
        f"## Simulation requirement\n{simulation_requirement}",
        f"\n## Entity information ({len(entities)} entities)\n{entity_summary}",
    ]

    current_length = sum(len(p) for p in context_parts)
    remaining_length = self.MAX_CONTEXT_LENGTH - current_length - 500

    if remaining_length > 0 and document_text:
        doc_text = document_text[:remaining_length]
        if len(document_text) > remaining_length:
            doc_text += "\n...(document truncated)"
        context_parts.append(f"\n## Original document\n{doc_text}")

    return "\n".join(context_parts)
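
To make the budget concrete, here is a small standalone sketch of the same truncation rule. `fit_document` is a hypothetical helper written for this example; the 50000 limit and the 500-character headroom come from the code above. Note that `len()` counts characters, so the budget is a character budget rather than a token budget.

```python
from typing import List

MAX_CONTEXT_LENGTH = 50000  # same constant as in the article
RESERVED = 500              # headroom left free, as in _build_context


def fit_document(fixed_parts: List[str], document_text: str,
                 max_length: int = MAX_CONTEXT_LENGTH, reserved: int = RESERVED) -> str:
    """Return the slice of the document that fits after the fixed sections."""
    used = sum(len(p) for p in fixed_parts)
    remaining = max_length - used - reserved
    if remaining <= 0 or not document_text:
        return ""  # no room left, or nothing to add
    excerpt = document_text[:remaining]
    if len(document_text) > remaining:
        excerpt += "\n...(document truncated)"
    return excerpt


# 1000 characters of fixed sections leave 48500 characters for the document
excerpt = fit_document(["a" * 1000], "x" * 60000)
print(excerpt.count("x"))
```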

Summarizing Entities

The following code groups entities by type and shows only a sample of each type:

def _summarize_entities(self, entities: List[EntityNode]) -> str:
    lines = []
    by_type: Dict[str, List[EntityNode]] = {}
    for e in entities:
        t = e.get_entity_type() or "Unknown"
        by_type.setdefault(t, []).append(e)

    for entity_type, type_entities in by_type.items():
        lines.append(f"\n### {entity_type} ({len(type_entities)} entities)")
        display_count = self.ENTITIES_PER_TYPE_DISPLAY
        summary_len = self.ENTITY_SUMMARY_LENGTH

        for e in type_entities[:display_count]:
            summary_preview = (e.summary[:summary_len] + "...") if len(e.summary) > summary_len else e.summary
            lines.append(f"- {e.name}: {summary_preview}")

        if len(type_entities) > display_count:
            lines.append(f"  ... and {len(type_entities) - display_count} more")

    return "\n".join(lines)

Example output:

### Student (45 entities)
- Zhang Wei: Active in the student union, posts frequently about campus events and academic pressure...
- Li Ming: Graduate student researching AI ethics, often shares technology news...
... and 43 more

### University (3 entities)
- Wuhan University: Official account, posts announcements and news...

Generating the Time Configuration

Define the simulation duration and activity patterns:

def _generate_time_config(self, context: str, num_entities: int) -> Dict[str, Any]:
    context_truncated = context[:self.TIME_CONFIG_CONTEXT_LENGTH]
    max_agents_allowed = max(1, int(num_entities * 0.9))

    prompt = f"""Based on the following simulation requirements, generate the time configuration.
{context_truncated}
## Task
Generate the time configuration as JSON.
# ... field details as in the earlier explanation ...
"""
    system_prompt = "You are an expert in social media simulation. Return pure JSON."

    try:
        return self._call_llm_with_retry(prompt, system_prompt)
    except Exception as e:
        logger.warning(f"LLM generation of the time config failed: {e}; using the default")
        return self._get_default_time_config(num_entities)

Validating the Time Configuration

def _parse_time_config(self, result: Dict[str, Any], num_entities: int) -> TimeSimulationConfig:
    agents_per_hour_min = result.get("agents_per_hour_min", max(1, num_entities // 15))
    agents_per_hour_max = result.get("agents_per_hour_max", max(5, num_entities // 5))

    if agents_per_hour_min > num_entities:
        logger.warning(f"agents_per_hour_min ({agents_per_hour_min}) exceeds total agents ({num_entities}); corrected")
        agents_per_hour_min = max(1, num_entities // 10)

    if agents_per_hour_max > num_entities:
        logger.warning(f"agents_per_hour_max ({agents_per_hour_max}) exceeds total agents ({num_entities}); corrected")
        agents_per_hour_max = max(agents_per_hour_min + 1, num_entities // 2)

    if agents_per_hour_min >= agents_per_hour_max:
        agents_per_hour_min = max(1, agents_per_hour_max // 2)
        logger.warning(f"agents_per_hour_min >= max; corrected to {agents_per_hour_min}")

    return TimeSimulationConfig(
        total_simulation_hours=result.get("total_simulation_hours", 72),
        minutes_per_round=result.get("minutes_per_round", 60),
        agents_per_hour_min=agents_per_hour_min,
        agents_per_hour_max=agents_per_hour_max,
        peak_hours=result.get("peak_hours", [19, 20, 21, 22]),
        off_peak_hours=result.get("off_peak_hours", [0, 1, 2, 3, 4, 5]),
        off_peak_activity_multiplier=0.05,
        morning_activity_multiplier=0.4,
        work_activity_multiplier=0.7,
        peak_activity_multiplier=1.5
    )
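
The three correction rules are easier to follow with numbers. The sketch below lifts them into a standalone function (`clamp_agents_per_hour` is a name chosen for this example, not part of MiroFish) and runs them on a few inputs an LLM might plausibly return.

```python
from typing import Tuple


def clamp_agents_per_hour(min_v: int, max_v: int, num_entities: int) -> Tuple[int, int]:
    """Apply the same three corrections as _parse_time_config, in the same order."""
    if min_v > num_entities:          # minimum larger than the population
        min_v = max(1, num_entities // 10)
    if max_v > num_entities:          # maximum larger than the population
        max_v = max(min_v + 1, num_entities // 2)
    if min_v >= max_v:                # inverted or empty range
        min_v = max(1, max_v // 2)
    return min_v, max_v


print(clamp_agents_per_hour(200, 500, 100))  # both values exceed 100 agents
print(clamp_agents_per_hour(30, 20, 100))    # inverted range
print(clamp_agents_per_hour(5, 20, 100))     # already valid
```

With 100 agents, the first call is corrected to (10, 50), the second to (10, 20), and the third passes through unchanged.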

Default Time Configuration

def _get_default_time_config(self, num_entities: int) -> Dict[str, Any]:
    return {
        "total_simulation_hours": 72,
        "minutes_per_round": 60,
        "agents_per_hour_min": max(1, num_entities // 15),
        "agents_per_hour_max": max(5, num_entities // 5),
        "peak_hours": [19, 20, 21, 22],
        "off_peak_hours": [0, 1, 2, 3, 4, 5],
        "morning_hours": [6, 7, 8],
        "work_hours": [9, 10, 11, 12, 13, 14, 15, 16, 17, 18],
        "reasoning": "Using the default configuration for the Chinese time zone"
    }
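
The article does not show how the simulator consumes these hour lists and their multipliers. One plausible reading, sketched here purely as an assumption, is a lookup from hour of day to multiplier. `HourBuckets` and `activity_multiplier` are hypothetical names, and treating hours outside every list (such as 23) as unscaled is also an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HourBuckets:
    """Hypothetical holder for the hour lists and multipliers shown in the article."""
    peak_hours: List[int] = field(default_factory=lambda: [19, 20, 21, 22])
    off_peak_hours: List[int] = field(default_factory=lambda: [0, 1, 2, 3, 4, 5])
    morning_hours: List[int] = field(default_factory=lambda: [6, 7, 8])
    work_hours: List[int] = field(default_factory=lambda: list(range(9, 19)))
    peak_multiplier: float = 1.5
    off_peak_multiplier: float = 0.05
    morning_multiplier: float = 0.4
    work_multiplier: float = 0.7


def activity_multiplier(hour: int, buckets: HourBuckets) -> float:
    """Pick the activity multiplier for an hour of the day (0-23)."""
    if hour in buckets.peak_hours:
        return buckets.peak_multiplier
    if hour in buckets.off_peak_hours:
        return buckets.off_peak_multiplier
    if hour in buckets.morning_hours:
        return buckets.morning_multiplier
    if hour in buckets.work_hours:
        return buckets.work_multiplier
    return 1.0  # assumption: hours in no bucket are left unscaled


buckets = HourBuckets()
print([activity_multiplier(h, buckets) for h in (3, 7, 10, 20, 23)])
```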

Generating the Event Configuration

Generate the initial posts, hot topics, and narrative direction:

def _generate_event_config(
    self,
    context: str,
    simulation_requirement: str,
    entities: List[EntityNode]
) -> Dict[str, Any]:

    entity_types_available = list(set(
        e.get_entity_type() or "Unknown" for e in entities
    ))

    type_examples = {}
    for e in entities:
        etype = e.get_entity_type() or "Unknown"
        type_examples.setdefault(etype, [])
        if len(type_examples[etype]) < 3:
            type_examples[etype].append(e.name)

    type_info = "\n".join([
        f"- {t}: {', '.join(examples)}"
        for t, examples in type_examples.items()
    ])

    context_truncated = context[:self.EVENT_CONFIG_CONTEXT_LENGTH]

    prompt = f"""Based on the following simulation requirements, generate the event configuration.
# ... field details as in the earlier explanation ...
"""
    system_prompt = "You are an expert in opinion analysis. Return pure JSON."

    try:
        return self._call_llm_with_retry(prompt, system_prompt)
    except Exception as e:
        logger.warning(f"LLM generation of the event config failed: {e}; using the default")
        return {
            "hot_topics": [],
            "narrative_direction": "",
            "initial_posts": [],
            "reasoning": "Using the default configuration"
        }

Assigning Publishers to the Initial Posts

Once the initial posts are generated, assign each one automatically to a real agent:

def _assign_initial_post_agents(
    self,
    event_config: EventConfig,
    agent_configs: List[AgentActivityConfig]
) -> EventConfig:

    if not event_config.initial_posts:
        return event_config

    agents_by_type: Dict[str, List[AgentActivityConfig]] = {}
    for agent in agent_configs:
        etype = agent.entity_type.lower()
        agents_by_type.setdefault(etype, []).append(agent)

    type_aliases = {
        "official": ["official", "university", "governmentagency", "government"],
        "university": ["university", "official"],
        "mediaoutlet": ["mediaoutlet", "media"],
        "student": ["student", "person"],
        "professor": ["professor", "expert", "teacher"],
        "alumni": ["alumni", "person"],
        "organization": ["organization", "ngo", "company", "group"],
        "person": ["person", "student", "alumni"],
    }

    used_indices: Dict[str, int] = {}

    updated_posts = []
    for post in event_config.initial_posts:
        poster_type = post.get("poster_type", "").lower()
        content = post.get("content", "")

        matched_agent_id = None

        if poster_type in agents_by_type:
            agents = agents_by_type[poster_type]
            idx = used_indices.get(poster_type, 0) % len(agents)
            matched_agent_id = agents[idx].agent_id
            used_indices[poster_type] = idx + 1
        else:
            for alias_key, aliases in type_aliases.items():
                if poster_type in aliases or alias_key == poster_type:
                    for alias in aliases:
                        if alias in agents_by_type:
                            agents = agents_by_type[alias]
                            idx = used_indices.get(alias, 0) % len(agents)
                            matched_agent_id = agents[idx].agent_id
                            used_indices[alias] = idx + 1
                            break
                    if matched_agent_id is not None:
                        break

        if matched_agent_id is None:
            logger.warning(f"No agent matches type '{poster_type}'; using the agent with the highest influence")
            if agent_configs:
                sorted_agents = sorted(agent_configs, key=lambda a: a.influence_weight, reverse=True)
                matched_agent_id = sorted_agents[0].agent_id
            else:
                matched_agent_id = 0

        updated_posts.append({
            "content": content,
            "poster_type": post.get("poster_type", "Unknown"),
            "poster_agent_id": matched_agent_id
        })

        logger.info(f"Initial post assigned: poster_type='{poster_type}' -> agent_id={matched_agent_id}")

    event_config.initial_posts = updated_posts
    return event_config
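
The round-robin part of this method is the piece most worth isolating. The sketch below keeps only that part and leaves out the alias lookup and the highest-influence fallback; `assign_round_robin` and the agent IDs are made up for the example.

```python
from typing import Dict, List


def assign_round_robin(poster_types: List[str],
                       agents_by_type: Dict[str, List[int]]) -> List[int]:
    """Cycle through the agents of each type so posts are spread across them."""
    used: Dict[str, int] = {}   # next position to use, per type
    assigned: List[int] = []
    for raw_type in poster_types:
        ptype = raw_type.lower()            # types are matched case-insensitively
        agents = agents_by_type[ptype]      # KeyError here is where aliases would kick in
        idx = used.get(ptype, 0) % len(agents)
        assigned.append(agents[idx])
        used[ptype] = idx + 1
    return assigned


result = assign_round_robin(
    ["Student", "student", "student", "MediaOutlet"],
    {"student": [4, 9], "mediaoutlet": [2]},
)
print(result)
```

With two student agents, the third student post wraps around to the first agent again, so no single account publishes every seed post of its type.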

Generating Agent Configurations in Batches

Generate agent configurations 15 at a time to stay within the LLM's limits:

def _generate_agent_configs_batch(
    self,
    context: str,
    entities: List[EntityNode],
    start_idx: int,
    simulation_requirement: str
) -> List[AgentActivityConfig]:

    entity_list = []
    summary_len = self.AGENT_SUMMARY_LENGTH
    for i, e in enumerate(entities):
        entity_list.append({
            "agent_id": start_idx + i,
            "entity_name": e.name,
            "entity_type": e.get_entity_type() or "Unknown",
            "summary": e.summary[:summary_len] if e.summary else ""
        })

    prompt = f"""Based on the following information, generate a social media activity configuration for each entity.
Simulation requirement: {simulation_requirement}
## Entity list

{json.dumps(entity_list, ensure_ascii=False, indent=2)}

"""

    system_prompt = "You are an expert in social media behavior analysis. Return pure JSON."

    try:
        result = self._call_llm_with_retry(prompt, system_prompt)
        llm_configs = {cfg["agent_id"]: cfg for cfg in result.get("agent_configs", [])}
    except Exception as e:
        logger.warning(f"LLM generation failed for this agent config batch: {e}; using rule-based generation")
        llm_configs = {}

    configs = []
    for i, entity in enumerate(entities):
        agent_id = start_idx + i
        cfg = llm_configs.get(agent_id, {})

        if not cfg:
            cfg = self._generate_agent_config_by_rule(entity)

        config = AgentActivityConfig(
            agent_id=agent_id,
            entity_uuid=entity.uuid,
            entity_name=entity.name,
            entity_type=entity.get_entity_type() or "Unknown",
            activity_level=cfg.get("activity_level", 0.5),
            posts_per_hour=cfg.get("posts_per_hour", 0.5),
            comments_per_hour=cfg.get("comments_per_hour", 1.0),
            active_hours=cfg.get("active_hours", list(range(9, 23))),
            response_delay_min=cfg.get("response_delay_min", 5),
            response_delay_max=cfg.get("response_delay_max", 60),
            sentiment_bias=cfg.get("sentiment_bias", 0.0),
            stance=cfg.get("stance", "neutral"),
            influence_weight=cfg.get("influence_weight", 1.0)
        )
        configs.append(config)

    return configs
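
In the method above, the lookup table is built with a dict comprehension, so a single returned entry without `agent_id` raises `KeyError` and sends the whole batch to the rule-based path. A more forgiving merge is sketched below as one possible variation, not as MiroFish's behavior: `merge_with_fallback` is a hypothetical helper that skips only the malformed entries and also accepts IDs returned as strings.

```python
from typing import Any, Callable, Dict, List


def merge_with_fallback(agent_ids: List[int],
                        llm_items: List[Any],
                        fallback: Callable[[int], Dict[str, Any]]) -> Dict[int, Dict[str, Any]]:
    """Use the LLM entry for each agent when present, else the fallback config."""
    by_id: Dict[int, Dict[str, Any]] = {}
    for item in llm_items:
        try:
            by_id[int(item["agent_id"])] = item   # tolerate "15" as well as 15
        except (KeyError, TypeError, ValueError):
            continue                              # skip only the malformed entry
    return {aid: by_id.get(aid) or fallback(aid) for aid in agent_ids}


llm_items = [
    {"agent_id": "15", "stance": "supportive"},  # ID returned as a string
    {"stance": "missing id"},                    # malformed entry
    {"agent_id": 16, "stance": "opposing"},
]
merged = merge_with_fallback(
    [15, 16, 17],
    llm_items,
    fallback=lambda aid: {"agent_id": aid, "stance": "neutral"},
)
print({aid: cfg["stance"] for aid, cfg in merged.items()})
```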

Rule-Based Fallback

def _generate_agent_config_by_rule(self, entity: EntityNode) -> Dict[str, Any]:
    entity_type = (entity.get_entity_type() or "Unknown").lower()

    if entity_type in ["university", "governmentagency", "ngo"]:
        return {
            "activity_level": 0.2,
            "posts_per_hour": 0.1,
            "comments_per_hour": 0.05,
            "active_hours": list(range(9, 18)),
            "response_delay_min": 60,
            "response_delay_max": 240,
            "sentiment_bias": 0.0,
            "stance": "neutral",
            "influence_weight": 3.0
        }
    elif entity_type in ["mediaoutlet"]:
        return {
            "activity_level": 0.5,
            "posts_per_hour": 0.8,
            "comments_per_hour": 0.3,
            "active_hours": list(range(7, 24)),
            "response_delay_min": 5,
            "response_delay_max": 30,
            "sentiment_bias": 0.0,
            "stance": "observer",
            "influence_weight": 2.5
        }
    elif entity_type in ["professor", "expert", "official"]:
        return {
            "activity_level": 0.4,
            "posts_per_hour": 0.3,
            "comments_per_hour": 0.5,
            "active_hours": list(range(8, 22)),
            "response_delay_min": 15,
            "response_delay_max": 90,
            "sentiment_bias": 0.0,
            "stance": "neutral",
            "influence_weight": 2.0
        }
    elif entity_type in ["student"]:
        return {
            "activity_level": 0.8,
            "posts_per_hour": 0.6,
            "comments_per_hour": 1.5,
            "active_hours": [8, 9, 10, 11, 12, 13, 18, 19, 20, 21, 22, 23],
            "response_delay_min": 1,
            "response_delay_max": 15,
            "sentiment_bias": 0.0,
            "stance": "neutral",
            "influence_weight": 0.8
        }
    elif entity_type in ["alumni"]:
        return {
            "activity_level": 0.6,
            "posts_per_hour": 0.4,
            "comments_per_hour": 0.8,
            "active_hours": [12, 13, 19, 20, 21, 22, 23],
            "response_delay_min": 5,
            "response_delay_max": 30,
            "sentiment_bias": 0.0,
            "stance": "neutral",
            "influence_weight": 1.0
        }
    else:
        return {
            "activity_level": 0.7,
            "posts_per_hour": 0.5,
            "comments_per_hour": 1.2,
            "active_hours": [9, 10, 11, 12, 13, 18, 19, 20, 21, 22, 23],
            "response_delay_min": 2,
            "response_delay_max": 20,
            "sentiment_bias": 0.0,
            "stance": "neutral",
            "influence_weight": 1.0
        }

LLM Calls with Retry and JSON Repair

LLM calls may return incomplete output or invalid JSON. This pattern handles both cases:

def _call_llm_with_retry(self, prompt: str, system_prompt: str) -> Dict[str, Any]:
    import re

    max_attempts = 3
    last_error = None

    for attempt in range(max_attempts):
        try:
            response = self.client.chat.completions.create(
                model=self.model_name,
                messages=[
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": prompt}
                ],
                response_format={"type": "json_object"},
                temperature=0.7 - (attempt * 0.1)
            )

            content = response.choices[0].message.content
            finish_reason = response.choices[0].finish_reason

            if finish_reason == 'length':
                logger.warning(f"LLM output was truncated (attempt {attempt+1})")
                content = self._fix_truncated_json(content)

            try:
                return json.loads(content)
            except json.JSONDecodeError as e:
                logger.warning(f"JSON parsing failed (attempt {attempt+1}): {str(e)[:80]}")

                fixed = self._try_fix_config_json(content)
                if fixed:
                    return fixed

                last_error = e

        except Exception as e:
            logger.warning(f"LLM call failed (attempt {attempt+1}): {str(e)[:80]}")
            last_error = e
            import time
            time.sleep(2 * (attempt + 1))

    raise last_error or Exception("LLM call failed")
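
The retry schedule in this method (temperature dropping by 0.1 per attempt, wait time growing by 2 seconds per attempt) can be tried out without an LLM. In this sketch `call_with_retry` is a hypothetical wrapper: the LLM call is any function that takes a temperature, and `sleep` is injectable so the schedule can be inspected without waiting. Unlike the method above, it does not sleep after the final attempt.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retry(call: Callable[[float], T],
                    max_attempts: int = 3,
                    base_temperature: float = 0.7,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry `call`, lowering the temperature and backing off linearly each time."""
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        # round() avoids float noise such as 0.49999999999999994
        temperature = round(base_temperature - attempt * 0.1, 2)
        try:
            return call(temperature)
        except Exception as exc:
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(2 * (attempt + 1))   # 2s, then 4s
    raise last_error or RuntimeError("LLM call failed")


temps, sleeps = [], []


def flaky_llm(temperature: float) -> dict:
    """Stand-in for an LLM call: fails twice, then succeeds."""
    temps.append(temperature)
    if len(temps) < 3:
        raise ValueError("invalid JSON")
    return {"ok": True}


result = call_with_retry(flaky_llm, sleep=sleeps.append)
print(result, temps, sleeps)
```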

Repairing Truncated JSON

def _fix_truncated_json(self, content: str) -> str:
    content = content.strip()
    open_braces = content.count('{') - content.count('}')
    open_brackets = content.count('[') - content.count(']')

    if content and content[-1] not in '",}]':
        content += '"'

    content += ']' * open_brackets
    content += '}' * open_braces

    return content
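
One limitation of the counting approach: it appends every `]` before every `}`, which matches the real nesting only when arrays are the innermost open structure. For output cut off inside `{"agent_configs": [{"agent_id": 1, "stance": "neut`, it produces `..."neut"]}}`, which is still invalid, so the caller ends up retrying. A stack-based variant that closes structures in the reverse of the order they were opened is sketched below as an alternative. `close_truncated_json` is a hypothetical name, and the sketch still does not handle output that stops right after a key, a colon, or a comma.

```python
import json


def close_truncated_json(text: str) -> str:
    """Append the closing quote, brackets and braces a truncated JSON text lacks."""
    stack = []          # expected closers, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False        # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            stack.append("}")
        elif ch == "[":
            stack.append("]")
        elif ch in "}]" and stack:
            stack.pop()
    suffix = '"' if in_string else ""  # finish a string cut off mid-value
    return text + suffix + "".join(reversed(stack))


truncated = '{"agent_configs": [{"agent_id": 1, "stance": "neut'
repaired = close_truncated_json(truncated)
print(repaired)
print(json.loads(repaired))
```

Because the scanner tracks whether it is inside a string, braces and brackets that appear in string values are not counted.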

Advanced JSON Repair

def _try_fix_config_json(self, content: str) -> Optional[Dict[str, Any]]:
    import re

    content = self._fix_truncated_json(content)

    json_match = re.search(r'\{[\s\S]*\}', content)
    if json_match:
        json_str = json_match.group()

        def fix_string(match):
            s = match.group(0)
            s = s.replace('\n', ' ').replace('\r', ' ')
            s = re.sub(r'\s+', ' ', s)
            return s

        json_str = re.sub(r'"[^"\\]*(?:\\.[^"\\]*)*"', fix_string, json_str)

        try:
            return json.loads(json_str)
        except json.JSONDecodeError:
            # Second pass: strip control characters, then retry
            json_str = re.sub(r'[\x00-\x1f\x7f-\x9f]', ' ', json_str)
            json_str = re.sub(r'\s+', ' ', json_str)
            try:
                return json.loads(json_str)
            except json.JSONDecodeError:
                pass

    return None
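
The string-cleaning step targets one specific failure: raw line breaks inside string values, which `json.loads` rejects by default. The standalone sketch below reuses the same regular expression on a made-up payload; `flatten_newlines_in_strings` is a name chosen for this example. The standard library's `strict=False` option is shown as an alternative that keeps the line break instead of replacing it.

```python
import json
import re

# Matches a complete JSON string literal, including escaped characters
STRING_LITERAL = re.compile(r'"[^"\\]*(?:\\.[^"\\]*)*"')


def flatten_newlines_in_strings(json_text: str) -> str:
    """Collapse whitespace runs (including raw newlines) inside string literals."""
    def fix(match: "re.Match[str]") -> str:
        return re.sub(r"\s+", " ", match.group(0))
    return STRING_LITERAL.sub(fix, json_text)


# The \n below is a real newline character inside the JSON string value
raw = '{"reasoning": "peak hours follow\nevening usage"}'

try:
    json.loads(raw)
    parsed_directly = True
except json.JSONDecodeError:
    parsed_directly = False   # control characters are rejected by default

cleaned = json.loads(flatten_newlines_in_strings(raw))
lenient = json.loads(raw, strict=False)   # stdlib alternative: allow control characters
print(parsed_directly, cleaned, lenient)
```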

Configuration Data Structures

Agent Activity Configuration

@dataclass
class AgentActivityConfig:
    """Activity configuration for a single agent"""
    agent_id: int
    entity_uuid: str
    entity_name: str
    entity_type: str

    activity_level: float = 0.5
    posts_per_hour: float = 1.0
    comments_per_hour: float = 2.0
    active_hours: List[int] = field(default_factory=lambda: list(range(8, 23)))
    response_delay_min: int = 5
    response_delay_max: int = 60
    sentiment_bias: float = 0.0
    stance: str = "neutral"
    influence_weight: float = 1.0

Time Simulation Configuration

@dataclass
class TimeSimulationConfig:
    """Time simulation configuration (Chinese time zone)"""
    total_simulation_hours: int = 72
    minutes_per_round: int = 60
    agents_per_hour_min: int = 5
    agents_per_hour_max: int = 20
    peak_hours: List[int] = field(default_factory=lambda: [19, 20, 21, 22])
    peak_activity_multiplier: float = 1.5
    off_peak_hours: List[int] = field(default_factory=lambda: [0, 1, 2, 3, 4, 5])
    off_peak_activity_multiplier: float = 0.05
    morning_hours: List[int] = field(default_factory=lambda: [6, 7, 8])
    morning_activity_multiplier: float = 0.4
    work_hours: List[int] = field(default_factory=lambda: [9, 10, 11, 12, 13, 14, 15, 16, 17, 18])
    work_activity_multiplier: float = 0.7

Full Simulation Parameters

@dataclass
class SimulationParameters:
    """Full simulation parameter configuration"""
    simulation_id: str
    project_id: str
    graph_id: str
    simulation_requirement: str

    time_config: TimeSimulationConfig = field(default_factory=TimeSimulationConfig)
    agent_configs: List[AgentActivityConfig] = field(default_factory=list)
    event_config: EventConfig = field(default_factory=EventConfig)
    twitter_config: Optional[PlatformConfig] = None
    reddit_config: Optional[PlatformConfig] = None

    llm_model: str = ""
    llm_base_url: str = ""

    generated_at: str = field(default_factory=lambda: datetime.now().isoformat())
    generation_reasoning: str = ""

    def to_dict(self) -> Dict[str, Any]:
        time_dict = asdict(self.time_config)
        return {
            "simulation_id": self.simulation_id,
            "project_id": self.project_id,
            "graph_id": self.graph_id,
            "simulation_requirement": self.simulation_requirement,
            "time_config": time_dict,
            "agent_configs": [asdict(a) for a in self.agent_configs],
            "event_config": asdict(self.event_config),
            "twitter_config": asdict(self.twitter_config) if self.twitter_config else None,
            "reddit_config": asdict(self.reddit_config) if self.reddit_config else None,
            "llm_model": self.llm_model,
            "llm_base_url": self.llm_base_url,
            "generated_at": self.generated_at,
            "generation_reasoning": self.generation_reasoning,
        }
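
The pipeline's last step is saving the result as JSON. The article does not show that code, so the following is only a sketch under assumptions: `AgentStub` is a trimmed, hypothetical stand-in for `AgentActivityConfig`, and `dump_config` is a made-up helper. It shows how `asdict` output serializes and why `ensure_ascii=False` helps when entity names are not ASCII.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class AgentStub:
    """Hypothetical, trimmed stand-in for AgentActivityConfig."""
    agent_id: int
    entity_name: str
    stance: str = "neutral"
    active_hours: List[int] = field(default_factory=lambda: list(range(8, 23)))


def dump_config(agents: List[AgentStub]) -> str:
    """Serialize agents to JSON text, keeping non-ASCII names readable."""
    payload = {"agent_configs": [asdict(a) for a in agents]}
    return json.dumps(payload, ensure_ascii=False, indent=2)


text = dump_config([AgentStub(agent_id=0, entity_name="武汉大学")])
print(text)
```

Without `ensure_ascii=False`, the name would be written as `\uXXXX` escapes, which is still valid JSON but harder to review by eye.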

Summary Table: Agent Type Patterns

Agent type	Activity	Active hours	Posts/hour	Comments/hour	Response delay (min)	Influence
University	0.2	9-17	0.1	0.05	60-240	3.0
Government agency	0.2	9-17	0.1	0.05	60-240	3.0
Media outlet	0.5	7-23	0.8	0.3	5-30	2.5
Professor	0.4	8-21	0.3	0.5	15-90	2.0
Student	0.8	8-13, 18-23	0.6	1.5	1-15	0.8
Alumni	0.6	12-13, 19-23	0.4	0.8	5-30	1.0
Person (default)	0.7	9-13, 18-23	0.5	1.2	2-20	1.0

Conclusion

For a production-ready, LLM-powered simulation configuration pipeline, apply these practices:

Sequential generation: split the configuration into stages (time → events → agents → platforms)
Small batches: process 15 agents per batch to stay within context limits
JSON repair: automatically close brackets and clean up strings when output is truncated
Rule-based fallback: use rule-based configurations when the LLM fails
Type-specific patterns: tailor activity patterns to each agent type
Validation and correction: check that the generated values are consistent (for example, agents_per_hour ≤ total_agents)

Use these patterns to speed up development and keep LLM-generated configurations reliable.