mirror of https://github.com/apple/ml-4m.git
synced 2024-07-16 14:20:27 +03:00
Added arXiv link and demo sample output
@@ -74,6 +74,11 @@ preds = sampler({'rgb@224': img.cuda()}, seed=None)
sampler.plot_modalities(preds, save_path=None)
```
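The `img` tensor passed in above is expected to be a 224x224 RGB image batch on the GPU (the `'rgb@224'` key). As a rough sketch, one way to prepare it with a standard torchvision pipeline could look like the following; the file name `example.jpg` is a placeholder, and the exact transforms 4M expects (e.g. normalization statistics) are defined in this repo and may differ:

```python
from PIL import Image
import torchvision.transforms as T

# Hypothetical preprocessing sketch: build a [1, 3, 224, 224] float tensor
# for the 'rgb@224' input key. The exact transforms expected by the 4M
# sampler (e.g. normalization) may differ from this minimal pipeline.
transform = T.Compose([
    T.Resize(224),
    T.CenterCrop(224),
    T.ToTensor(),
])
img = transform(Image.open('example.jpg').convert('RGB')).unsqueeze(0)
```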
Running the generation above, you should expect to see an output like the following:
![4M demo sample (dark mode)](assets/4M_demo_sample_darkmode.jpg)
![4M demo sample (light mode)](assets/4M_demo_sample_lightmode.jpg)
To perform caption-to-all generation, replace the sampler input with: `preds = sampler({'caption': 'A lake house with a boat in front [S_1]'})`.
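For example, a complete caption-to-all snippet might look like the sketch below; it reuses the same `sampler` object and `plot_modalities` call from the RGB example above, with the example caption and its `[S_1]` token taken from this README.

```python
# Caption-to-all generation, reusing the sampler constructed for the RGB example above.
preds = sampler({'caption': 'A lake house with a boat in front [S_1]'})

# Visualize all generated modalities; pass a file path instead of None to save the figure.
sampler.plot_modalities(preds, save_path=None)
```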
For a list of available 4M models, please see the model zoo below, and see [README_GENERATION.md](README_GENERATION.md) for more instructions on generation.
BIN  assets/4M_demo_sample_darkmode.jpg  (new file, 1.2 MiB, binary file not shown)
BIN  assets/4M_demo_sample_lightmode.jpg  (new file, 1.2 MiB, binary file not shown)
@@ -11,7 +11,7 @@
"\n",
"(\\* Equal contribution, random order)\n",
"\n",
"[`Website`](https://4m.epfl.ch) | [`Paper`](TBD) | [`GitHub`](https://github.com/apple/ml-4m)\n",
"[`Website`](https://4m.epfl.ch) | [`Paper`](https://arxiv.org/abs/2406.09406) | [`GitHub`](https://github.com/apple/ml-4m)\n",
"\n",
"We adopt the 4M framework to scale a vision model to tens of tasks and modalities. The resulting model, named 4M-21, has significantly expanded out-of-the-box capabilities, and yields stronger results on downstream transfer tasks. \n",
"\n",