Commit Graph

1544 Commits

Author SHA1 Message Date
Alex Cheema
af734f1bf6 Merge pull request #737 from exo-explore/handlegzipdownload
handle -gzip suffix in etag for integrity check fixes #633
v0.0.15-alpha
2025-02-25 22:10:05 +00:00
Alex Cheema
ee095766d9 handle -gzip suffix in etag for integrity check fixes #633 2025-02-25 22:08:15 +00:00
Alex Cheema
a605e233ad Merge pull request #709 from exo-explore/notice
update notice in README
2025-02-18 11:43:14 +00:00
Alex Cheema
f9a1e5342b update notice in README 2025-02-18 11:41:09 +00:00
Alex Cheema
7a374a74cd Merge pull request #708 from exo-explore/notice
add notice to README
2025-02-17 22:55:44 +00:00
Alex Cheema
5a00899d73 Merge pull request #705 from cadenmackenzie/addingModelNameInputContainer
adding current model name to input container information
2025-02-17 22:55:29 +00:00
Alex Cheema
cb4bee2694 add notice to README 2025-02-17 22:54:56 +00:00
Caden MacKenzie
9078d094b9 adding current model name to input container information 2025-02-16 18:34:38 -08:00
Alex Cheema
ed70d47cfd Merge pull request #702 from exo-explore/alwayslogdownloaderror
make max_parallel_downloads configurable, increase download chunk size to 8MB
2025-02-14 21:27:12 +00:00
Alex Cheema
477e3a5e4c make max_parallel_downloads configurable, increase download chunk size to 8MB 2025-02-14 21:26:41 +00:00
Alex Cheema
be3b9ee973 Merge pull request #698 from exo-explore/alwayslogdownloaderror
always log download errors. some people e.g cant access huggingface
2025-02-13 22:56:33 +00:00
Alex Cheema
b4e6f8acad always log download errors. some people eg cant access huggingface which causes confusion 2025-02-13 22:55:09 +00:00
Alex Cheema
de99da7c75 Merge pull request #684 from divinity76/patch-1
workaround f16 cast ambiguity
2025-02-08 12:45:10 +00:00
Alex Cheema
76d1bd95f5 Merge pull request #688 from exo-explore/readmeupdate
apt-get debian noninteractive in circleci
2025-02-08 02:41:19 +00:00
Alex Cheema
928214d479 apt-get debian noninteractive in circleci 2025-02-08 02:40:51 +00:00
Alex Cheema
ce34a886c2 Merge pull request #687 from exo-explore/readmeupdate
README updates
2025-02-08 02:15:50 +00:00
Alex Cheema
d8c3aed0cc update discovery / peer networking modules 2025-02-08 02:15:13 +00:00
Alex Cheema
2c982d9295 update README to better reflect support for other devices like NVIDIA and Pi's 2025-02-08 02:13:04 +00:00
divinity76
5fe241ec61 code-breaking typo
oops
2025-02-06 19:02:02 +01:00
divinity76
05ff20fa89 workaround f16 cast ambiguity
For unknown reasons, without this, when trying to execute "Llama 3.2 1B", I get the error below. FWIW, I do not know the performance impact of this change. I can't even get exo running, but this change allows me to /get further/ (before running into a second issue with VRAM allocation; story for another day, I suppose).

error:
Failed to fetch completions: Error processing prompt (see logs with DEBUG>=2): Nvrtc Error 6, NVRTC_ERROR_COMPILATION
<null>(18): error: more than one user-defined conversion from "nv_bfloat16" to "half" applies:
            function "__half::__half(float)" (declared at line 214 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(short)" (declared at line 227 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned short)" (declared at line 228 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(int)" (declared at line 229 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned int)" (declared at line 230 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(long long)" (declared at line 231 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned long long)" (declared at line 232 of /usr/include/cuda_fp16.hpp)
    *((half4*)((data0+(alu0+(gidx1<<14)+(lidx0<<11)+alu1)))) = make_half4(((half)(val0)),((half)(val1)),((half)(val2)),((half)(val3)));
                                                                                 ^

<null>(18): error: more than one user-defined conversion from "nv_bfloat16" to "half" applies:
            function "__half::__half(float)" (declared at line 214 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(short)" (declared at line 227 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned short)" (declared at line 228 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(int)" (declared at line 229 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned int)" (declared at line 230 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(long long)" (declared at line 231 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned long long)" (declared at line 232 of /usr/include/cuda_fp16.hpp)
    *((half4*)((data0+(alu0+(gidx1<<14)+(lidx0<<11)+alu1)))) = make_half4(((half)(val0)),((half)(val1)),((half)(val2)),((half)(val3)));
                                                                                                ^

<null>(18): error: more than one user-defined conversion from "nv_bfloat16" to "half" applies:
            function "__half::__half(float)" (declared at line 214 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(short)" (declared at line 227 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned short)" (declared at line 228 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(int)" (declared at line 229 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned int)" (declared at line 230 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(long long)" (declared at line 231 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned long long)" (declared at line 232 of /usr/include/cuda_fp16.hpp)
    *((half4*)((data0+(alu0+(gidx1<<14)+(lidx0<<11)+alu1)))) = make_half4(((half)(val0)),((half)(val1)),((half)(val2)),((half)(val3)));
                                                                                                               ^

<null>(18): error: more than one user-defined conversion from "nv_bfloat16" to "half" applies:
            function "__half::__half(float)" (declared at line 214 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(short)" (declared at line 227 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned short)" (declared at line 228 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(int)" (declared at line 229 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned int)" (declared at line 230 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(long long)" (declared at line 231 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned long long)" (declared at line 232 of /usr/include/cuda_fp16.hpp)
    *((half4*)((data0+(alu0+(gidx1<<14)+(lidx0<<11)+alu1)))) = make_half4(((half)(val0)),((half)(val1)),((half)(val2)),((half)(val3)));
                                                                                                                              ^

4 errors detected in the compilation of "<null>".
2025-02-06 18:54:15 +01:00
Alex Cheema
b5fc4bc288 Merge pull request #675 from exo-explore/rmtenacity
remove tenacity dependency, implement simple retry logic instead
2025-02-03 21:58:08 +00:00
Alex Cheema
5157d80a46 remove tenacity dependency, implement simple retry logic instead 2025-02-03 21:56:38 +00:00
Alex Cheema
75914b4de8 Merge pull request #669 from pavel-rodionov/feature-local-models
Add toggle to show only models downloaded locally
2025-02-03 21:45:27 +00:00
Rodionov Pavel
d084dbe574 Add toggle to show only models downloaded locally 2025-02-01 23:45:19 -08:00
Alex Cheema
1a77a52d71 Merge pull request #666 from exo-explore/patchmanualdiscovery
patch for manual discovery, set known_peers
2025-02-01 23:07:21 +00:00
Alex Cheema
72329ba984 patch for manual discovery, set known_peers 2025-02-01 23:06:57 +00:00
Alex Cheema
f663b0afa2 Merge pull request #665 from exo-explore/resumedownload
add model downloading section to README
2025-02-01 20:23:58 +00:00
Alex Cheema
51b5c2ca9b add model downloading section to README 2025-02-01 20:23:05 +00:00
Alex Cheema
9a1f0a85e6 Merge pull request #664 from exo-explore/resumedownload
resumable downloads with integrity checks
2025-02-01 18:34:36 +00:00
Alex Cheema
2c0d17c336 beautiful download 2025-02-01 17:29:19 +00:00
Alex Cheema
7034ee0fcb resumable downloads with integrity checks 2025-02-01 13:22:51 +00:00
Alex Cheema
7a75fb09b2 Merge pull request #660 from exo-explore/robustdownload
cleanup tmp files on failed download
2025-01-30 20:25:15 +00:00
Alex Cheema
0bebf8dfde fix indent 2025-01-30 20:21:28 +00:00
Alex Cheema
55c4385db5 cleanup tmp files on failed download 2025-01-30 20:11:06 +00:00
Alex Cheema
90690a7d10 Merge pull request #647 from deftdawg/patch-1
Add 4-bit to the end of DeepSeek V3/R1 model descriptions
2025-01-30 19:49:38 +00:00
Alex Cheema
130d998d36 Merge pull request #659 from exo-explore/robustdownload
ensure exo dir on start, retry with exp backoff on file downloads
2025-01-30 19:49:00 +00:00
Alex Cheema
788c49784c retry fetch_file_list also 2025-01-30 19:45:12 +00:00
Alex Cheema
6b1c8635fc ensure exo dir on start, retry with exp backoff on file downloads 2025-01-30 19:40:35 +00:00
Alex Cheema
24c410c19c Merge pull request #653 from exo-explore/tinyfixes
Tiny fixes
2025-01-29 19:08:05 +00:00
Alex Cheema
f6ed830ba6 Merge pull request #651 from exo-explore/parallelise_model_loadin
parallelise model loading
2025-01-29 19:07:25 +00:00
Alex Cheema
e6b4f2993c fix prompt output spacing in tui 2025-01-29 19:01:30 +00:00
DeftDawg
a25e02c913 Add 4-bit to the end of DeepSeek V3/R1 model descriptions 2025-01-29 14:00:13 -05:00
Alex Cheema
3675804f4d throttle repo progress events and only send them out if something changed 2025-01-29 18:55:54 +00:00
Alex Cheema
96f1aecb05 only in_progress if any given file is in_progress 2025-01-29 18:43:43 +00:00
Alex Cheema
23a5030604 even if part of a file is downloaded it may not be in_progress 2025-01-29 18:39:23 +00:00
Alex Cheema
31b56e862f make a singleton thread pool executor for tinygrad since we always want it to run on the same thread 2025-01-29 18:37:09 +00:00
Alex Cheema
9f6c688d62 update tinygrad 2025-01-29 18:06:38 +00:00
Alex Cheema
4887be5103 parallelise model loading 2025-01-29 02:32:59 +00:00
Alex Cheema
75091e206b Merge pull request #650 from exo-explore/chatgpttimeout
increase chatgpt api response timeout to 900 seconds
2025-01-29 02:03:52 +00:00
Alex Cheema
141de0d011 increase chatgpt api response timeout to 900 seconds 2025-01-29 02:03:00 +00:00