
ThinkMesh

ThinkMesh is a Python library that runs diverse reasoning paths in parallel, scores them with internal confidence signals, reallocates compute to promising branches, and fuses the outcomes with verifiers and reducers. It works offline with Hugging Face Transformers, with vLLM/TGI servers, and with hosted APIs.

Note: ThinkMesh is still in an early development phase, and breaking changes may occur between releases.

Highlights

  • Parallel reasoning with DeepConf-style confidence gating and budget reallocation
  • Offline-first with Transformers; optional vLLM/TGI for server-side batching
  • Hosted adapters for OpenAI and Anthropic
  • Async execution with dynamic microbatches
  • Reducers (majority/judge) and pluggable verifiers (regex/numeric/custom)
  • Caching, metrics, and JSON traces

Install

git clone https://github.com/martianlantern/thinkmesh.git
cd thinkmesh
pip install -e ".[dev,transformers]"

Quickstart: Offline DeepConf

from thinkmesh import think, ThinkConfig, ModelSpec, StrategySpec

cfg = ThinkConfig(
  model=ModelSpec(backend="transformers", model_name="Qwen2.5-7B-Instruct",
                  max_tokens=256, temperature=0.7, seed=42, extra={"device":"cuda:0"}),
  strategy=StrategySpec(name="deepconf", parallel=8, max_steps=2,
                        deepconf={"k":5,"tau_low":-1.25,"tau_ent":2.2,"realloc_top_p":0.4}),
  reducer={"name":"majority"},
  budgets={"wall_clock_s":20,"tokens":4000},
)
ans = think("Show that the product of any three consecutive integers is divisible by 3.", cfg)
print(ans.content, ans.confidence)
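
Roughly, the deepconf keys above set the confidence gate: k is the window of recent tokens to score, tau_low a floor on the mean token logprob, tau_ent a ceiling on recent token entropy, and realloc_top_p the fraction of surviving branches that receive the reallocated budget. The sketch below only illustrates that idea and is not ThinkMesh's internal code; Branch, mean_tail, and gate_branches are hypothetical names.

from dataclasses import dataclass

@dataclass
class Branch:
    text: str
    logprobs: list[float]    # per-token logprobs of the partial generation
    entropies: list[float]   # per-token entropies of the sampling distribution

def mean_tail(xs: list[float], k: int) -> float:
    # Average over the last k values; 0.0 if nothing has been generated yet.
    tail = xs[-k:]
    return sum(tail) / len(tail) if tail else 0.0

def gate_branches(branches: list[Branch], k=5, tau_low=-1.25,
                  tau_ent=2.2, realloc_top_p=0.4) -> list[Branch]:
    # Keep branches whose recent tokens look confident (high mean logprob,
    # low mean entropy), then hand the freed budget to the strongest few.
    survivors = [b for b in branches
                 if mean_tail(b.logprobs, k) >= tau_low
                 and mean_tail(b.entropies, k) <= tau_ent]
    survivors.sort(key=lambda b: mean_tail(b.logprobs, k), reverse=True)
    keep = max(1, int(round(len(survivors) * realloc_top_p)))
    return survivors[:keep]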

Quickstart: OpenAI Self-Consistency

import os
os.environ["OPENAI_API_KEY"] = "sk-..."
from thinkmesh import think, ThinkConfig, ModelSpec, StrategySpec
cfg = ThinkConfig(
  model=ModelSpec(backend="openai", model_name="gpt-4o-mini", max_tokens=256, temperature=0.6),
  strategy=StrategySpec(name="self_consistency", parallel=6, max_steps=1),
  reducer={"name":"majority"},
  budgets={"wall_clock_s":15,"tokens":3000},
)
print(think("List three creative uses for a paperclip.", cfg).content)

CLI

thinkmesh think -m Qwen2.5-7B-Instruct --backend transformers --strategy deepconf "What is 37*43?"

Examples

Debate Strategy (hosted)

from thinkmesh import think, ThinkConfig, ModelSpec, StrategySpec
cfg = ThinkConfig(
  model=ModelSpec(backend="openai", model_name="gpt-4o-mini", max_tokens=256, temperature=0.7),
  strategy=StrategySpec(name="debate", parallel=4, max_steps=2, debate={"rounds":2}),
  reducer={"name":"judge"},
  budgets={"wall_clock_s":25,"tokens":5000},
)
print(think("Argue whether every even integer > 2 is the sum of two primes.", cfg).content)

vLLM Local Server

from thinkmesh import think, ThinkConfig, ModelSpec, StrategySpec
cfg = ThinkConfig(
  model=ModelSpec(backend="vllm", model_name="Qwen2.5-7B-Instruct",
                  max_tokens=256, temperature=0.7, extra={"base_url":"http://localhost:8000/v1","api_key":"sk-"}),
  strategy=StrategySpec(name="deepconf", parallel=8, max_steps=2, deepconf={"k":5}),
  reducer={"name":"majority"},
  budgets={"wall_clock_s":20,"tokens":4000},
)
print(think("Give a constructive proof for the Pigeonhole Principle on a simple case.", cfg).content)

Custom Verifier

from thinkmesh import think, ThinkConfig, ModelSpec, StrategySpec
cfg = ThinkConfig(
  model=ModelSpec(backend="transformers", model_name="Qwen2.5-7B-Instruct", max_tokens=128),
  strategy=StrategySpec(name="self_consistency", parallel=5, max_steps=1),
  reducer={"name":"majority"},
  verifier={"type":"regex","pattern":r"Final Answer\s*:\s*.+$"},
  budgets={"wall_clock_s":10,"tokens":1500},
)
print(think("Answer with 'Final Answer: <value>' for 19*21.", cfg).content)

Tree of Thoughts (offline)

from thinkmesh import think, ThinkConfig, ModelSpec, StrategySpec
cfg = ThinkConfig(
  model=ModelSpec(backend="transformers", model_name="Qwen2.5-7B-Instruct", max_tokens=192),
  strategy=StrategySpec(name="tree", parallel=6, max_steps=2, tree={"branches":3,"depth":2}),
  reducer={"name":"majority"},
  budgets={"wall_clock_s":20,"tokens":3500},
)
print(think("Sketch a plan to prove that sqrt(2) is irrational.", cfg).content)

Traces, Metrics, Caching

Traces are emitted as JSON graphs inside the returned structure. Prometheus metrics and OpenTelemetry spans can be enabled via config extras. A local disk cache deduplicates repeated generations by hashing adapter, model, prompt, and params.
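
A minimal sketch of how such a content-addressed cache key can be derived, assuming SHA-256 over a canonical JSON serialization; this is illustrative only, and ThinkMesh's actual hashing scheme may differ.

import hashlib
import json

def cache_key(adapter: str, model: str, prompt: str, params: dict) -> str:
    # Serialize deterministically (sorted keys) so that identical
    # generation requests always hash to the same key.
    payload = json.dumps(
        {"adapter": adapter, "model": model, "prompt": prompt, "params": params},
        sort_keys=True, ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

key = cache_key("transformers", "Qwen2.5-7B-Instruct",
                "What is 37*43?", {"max_tokens": 256, "temperature": 0.7, "seed": 42})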

Extending

  • Implement a new backend by providing a Thinker.generate method that returns generated text and optional per-token logprobs (see the sketch after this list)
  • Add a new strategy by wiring a function in thinkmesh/strategies and registering by name
  • Add reducers/verifiers under thinkmesh/reduce
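
A minimal backend sketch under stated assumptions: the batched async signature and the GenResult container below are guesses at the Thinker protocol, not ThinkMesh's actual interface, and how a backend gets registered is not shown here.

from dataclasses import dataclass

@dataclass
class GenResult:
    text: str
    logprobs: list[float] | None = None  # optional per-token logprobs

class EchoThinker:
    """Toy backend that 'generates' by echoing its prompts."""
    async def generate(self, prompts: list[str], **params) -> list[GenResult]:
        # A real backend would call a model here; returning per-token
        # logprobs (when the backend exposes them) lets confidence-gated
        # strategies like deepconf score each branch.
        return [GenResult(text=p, logprobs=None) for p in prompts]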

License

MIT

References

@misc{deepconf2025,
  title         = {DeepConf: Deep Think with Confidence},
  year          = {2025},
  howpublished  = {\url{https://jiaweizzhao.github.io/deepconf/}}
}

@misc{wang2022selfconsistency,
  title         = {Self-Consistency Improves Chain-of-Thought Reasoning in Language Models},
  author        = {Wang, Xuezhi and Wei, Jason and others},
  year          = {2022},
  eprint        = {2203.11171},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CL}
}

@misc{yao2023tree,
  title         = {Tree of Thoughts: Deliberate Problem Solving with Large Language Models},
  author        = {Yao, Shunyu and others},
  year          = {2023},
  eprint        = {2305.10601},
  archivePrefix = {arXiv},
  primaryClass  = {cs.AI}
}

Citation

If you use this library in your work, please cite:

@software{thinkmesh2025,
  title        = {ThinkMesh: Parallel Thinking for LLMs},
  author       = {martianlantern},
  year         = {2025},
  note         = {Version 0.1.1},
}