Commit Graph

1552 Commits

Author SHA1 Message Date
Alex Cheema 1d5c28aed4 (partially) restore exo node equality by forwarding prompts to the dynamically selected head 2024-07-15 22:59:00 -07:00
Alex Cheema 1ec92b731e README notice about api endpoint 2024-07-15 22:46:06 -07:00
Alex Cheema 108a904ab3 Readme whitespace 2024-07-15 22:36:07 -07:00
Alex Cheema e17905e295 add global reset 2024-07-15 22:35:54 -07:00
Alex Cheema de9b89ea29 readme typo 2024-07-15 18:22:42 -07:00
Alex Cheema 199eeb03db known issues section in readme 2024-07-15 18:22:09 -07:00
Alex Cheema d2184f583a keep track of already visited peers in global operations: collect_topology 2024-07-15 15:43:00 -07:00
Alex Cheema 4502da5bc4 readme bug notice 2024-07-15 15:23:58 -07:00
Alex Cheema f9a201ddbf docs dir 2024-07-15 11:37:22 -07:00
Alex Cheema 4d43cb9174 logo 2024-07-15 11:37:12 -07:00
Alex Cheema 4e1f01eeec update links 2024-07-15 11:20:42 -07:00
Alex Cheema 1dea8b9c28 update discord link 2024-07-15 11:15:06 -07:00
Alex Cheema 544c229e8a discord link 2024-07-15 11:02:13 -07:00
Alex Cheema 98b30e056b readme tweak 2024-07-15 00:08:42 -07:00
Alex Cheema 963f8eb6a1 better logs for DEBUG>=1 2024-07-14 23:55:27 -07:00
Alex Cheema a009f7d608 move examples to examples dir 2024-07-14 23:48:20 -07:00
Alex Cheema b6595bac04 add llama-3-70b to the examples 2024-07-14 23:47:33 -07:00
Alex Cheema 54e8cad2d6 remove uneeded prints 2024-07-14 23:31:18 -07:00
Alex Cheema c691205591 empty space 2024-07-14 23:29:20 -07:00
Alex Cheema bcd58938de clean debug logs 2024-07-14 23:28:55 -07:00
Alex Cheema b9c323bb07 memory-efficient shard loading 2024-07-14 23:27:57 -07:00
Alex Cheema 53a5b3fc6a add uuid requirement 2024-07-14 21:46:41 -07:00
Alex Cheema 05b9fa497d initialize node id to uuid4 if not set 2024-07-14 21:46:30 -07:00
Alex Cheema ff597d9551 fix discovery 2024-07-14 21:46:13 -07:00
Alex Cheema a04974168e fix model import path 2024-07-14 21:46:00 -07:00
Alex Cheema b8a2a0fbe0 update readme run instruction 2024-07-14 21:26:56 -07:00
Alex Cheema a933352ac3 add DEBUG flag for controlling debug logs 2024-07-14 21:26:45 -07:00
Alex Cheema dd882fe6bc experimental notice 2024-07-14 21:16:20 -07:00
Alex Cheema c8753ba5fe reshuffle readme 2024-07-14 21:12:55 -07:00
Alex Cheema ee5204fbca readme installation instructions 2024-07-14 21:12:17 -07:00
Alex Cheema 78da11e10b slightly nicer readme 2024-07-14 21:05:41 -07:00
Alex Cheema 2fc472c8fe slightly nicer readme 2024-07-14 21:03:48 -07:00
Alex Cheema 8ff3e263a0 slightly nicer readme 2024-07-14 21:02:25 -07:00
Alex Cheema 32f2e36fd3 main rename 2024-07-14 21:01:28 -07:00
Alex Cheema 5bbde22a23 move everything under exo module 2024-07-14 21:00:37 -07:00
Alex Cheema c851644a43 update requirements, specify exact versions 2024-07-14 20:55:29 -07:00
Alex Cheema 32972033dd update readme 2024-07-14 18:38:48 -07:00
Alex Cheema 5ef07d41a5 readme 2024-07-14 18:09:38 -07:00
Alex Cheema 490fa102a4 tinygrad inference engine 2024-07-14 13:07:37 -07:00
Alex Cheema e6f387a690 handle is_finished 2024-07-13 23:27:34 -07:00
Alex Cheema b01f69bb6b add support for multiple concurrent requests with request ids 2024-07-13 23:11:01 -07:00
Alex Cheema 7077652c8e graceful node shutdown 2024-07-13 20:43:37 -07:00
Alex Cheema ca6095c04d a generic test for every inference engine 2024-07-13 18:25:26 -07:00
Alex Cheema 850b72d3ea make StatefulShardedModel callable, add some tests for mlx sharded inference 2024-07-13 15:41:15 -07:00
Alex Cheema 6ee0547eff fix layer calculation for sharded llama 2024-07-13 15:39:31 -07:00
Alex Cheema 445eda156c dynamically assign shards to nodes deterministically weighted by memory 2024-06-25 21:17:58 +01:00
Alex Cheema 36b8456798 collect global topology with local peer visibility, ring memory weighted partitioning strategy 2024-06-25 12:32:16 +01:00
Alex Cheema 3a66a0a4a8 add requirements.txt 2024-06-24 21:00:04 +01:00
Alex Cheema ee96c6b023 add another test for device capabiities on MacBook Air 2024-06-24 20:59:55 +01:00
Alex Cheema 6c8c9ee7b1 topology with partitioning strategy 2024-06-24 20:56:50 +01:00