Update llama.cpp to the latest version as part of an effort to make this
app usable on my Samsung Galaxy S10 smartphone.
The newer llama.cpp includes a fix for a double-close bug that was causing
the app to crash immediately upon starting the AI conversation (fixed in
llama.cpp commit 47f61aaa5f76d04).
It also adds support for 3B models, which are considerably smaller. The
llama-7B models were causing Android's low memory killer to terminate
Sherpa after just a few words of conversation, whereas new models such as
orca-mini-3b.ggmlv3.q4_0.bin work on this device without quickly exhausting
all available memory.
llama.cpp's model format compatibility has changed with this update, so ggml
files that worked in the previous version are unlikely to load now and need
to be converted to the new format. However, the orca-mini model above is
already in the new format and works out of the box.
llama.cpp's API has also changed in this update. Rather than rework the Dart
code to match the new API, I kept the inference logic in C++, using
llama.cpp's example code as a base. This lives in a new "llamasherpa"
library which calls into llama.cpp. Since inference passes lots of data
around in large arrays, running that loop in Dart likely carried significant
overhead, and this native approach should perform considerably faster.
This eliminates the need for Sherpa's Dart code to call llama.cpp directly,
so there's no longer any need to maintain a separate modified version of
llama.cpp and we can track the official upstream.