This repository was archived by the owner on Sep 18, 2024. It is now read-only.

Commit ea90617

docs: refine integrate with jina (#492)

* docs: refine integrate with jina
* docs: tweak words
* docs: refine structure of jina integration
* docs: create 3 tabs
* docs: add volume mount
* docs: upgrade version
* docs: add embed with docarray
* docs: refine comments, add changelog
* chore: bump docarray to 0.13.31
* docs: use mnt, add output
* docs: tweak words and refer to docarray
* docs: print run artifact id to output
* chore: add changelog
* docs: more text descriptions on artifact and zip
* docs: restructure usage
* docs: fix output shape
* docs: fix integration docarray host
* docs: remove label for clip training

1 parent c4d3d9d · commit ea90617

File tree

5 files changed: +108 −35 lines changed


CHANGELOG.md

Lines changed: 4 additions & 0 deletions
```diff
@@ -22,6 +22,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 - Bump `jina-hubble-sdk` to 0.8.1. ([#488](https://github.com/jina-ai/finetuner/pull/488))
 
+- Improve integration section in documentation. ([#492](https://github.com/jina-ai/finetuner/pull/492))
+
+- Bump `docarray` to 0.13.31. ([#492](https://github.com/jina-ai/finetuner/pull/492))
+
 ### Fixed
 
 - Use `uri` to represent image content in documentation creating training data code snippet. ([#484](https://github.com/jina-ai/finetuner/pull/484))
```

docs/walkthrough/create-training-data.md

Lines changed: 0 additions & 4 deletions
```diff
@@ -49,12 +49,10 @@ train_da = DocumentArray([
         Document(
             content='pencil skirt slim fit available for sell',
             modality='text',
-            tags={'finetuner_label': 'skirt-1'}
         ),
         Document(
             uri='https://...skirt-1.png',
             modality='image',
-            tags={'finetuner_label': 'skirt-1'}
         ),
     ],
 ),
@@ -63,12 +61,10 @@ train_da = DocumentArray([
         Document(
             content='stripped over-sized shirt for sell',
             modality='text',
-            tags={'finetuner_label': 'shirt-1'}
         ),
         Document(
             uri='https://...shirt-1.png',
             modality='image',
-            tags={'finetuner_label': 'shirt-1'}
         ),
     ],
 ),
```
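The `finetuner_label` tags removed above are unnecessary for CLIP-style training, where each text–image pair is itself the supervision signal. A minimal sketch of the resulting pair structure, using a hypothetical `Doc` dataclass as a stand-in for docarray's `Document` (so the snippet runs without docarray installed):

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for docarray's Document, only to illustrate
# the nesting after the `finetuner_label` tags were dropped.
@dataclass
class Doc:
    content: str = ''
    uri: str = ''
    modality: str = ''
    chunks: list = field(default_factory=list)

# Each training Document holds one text chunk and one image chunk;
# the pairing itself provides the positive match, so no label tags are needed.
pair = Doc(chunks=[
    Doc(content='pencil skirt slim fit available for sell', modality='text'),
    Doc(uri='https://...skirt-1.png', modality='image'),
])

print([c.modality for c in pair.chunks])  # -> ['text', 'image']
```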

docs/walkthrough/integrate-with-jina.md

Lines changed: 99 additions & 28 deletions
`````diff
@@ -1,5 +1,7 @@
+# Integration
+
 (integrate-with-jina)=
-# Integrate with Jina
+## Fine-tuned model as Executor
 
 Once fine-tuning is finished, it's time to actually use the model.
 Finetuner, being part of the Jina ecosystem, provides a convenient way to use tuned models via [Jina Executors](https://docs.jina.ai/fundamentals/executor/).
@@ -8,14 +10,35 @@ We've created the [`FinetunerExecutor`](https://hub.jina.ai/executor/13dzxycc) w
 More specifically, the executor exposes an `/encode` endpoint that embeds [Documents](https://docarray.jina.ai/fundamentals/document/) using the fine-tuned model.
 
 Loading a tuned model is simple! You just need to provide a few parameters under the `uses_with` argument when adding the `FinetunerExecutor` to the [Flow](https://docs.jina.ai/fundamentals/flow/).
+You have three options:
+
+````{tab} Artifact id and token
+```python
+import finetuner
+from jina import Flow
+
+finetuner.login()
 
-````{tab} Python
+token = finetuner.get_token()
+run = finetuner.get_run(
+    experiment_name='YOUR-EXPERIMENT',
+    run_name='YOUR-RUN'
+)
+
+f = Flow().add(
+    uses='jinahub+docker://FinetunerExecutor/v0.9.2',  # use v0.9.2-gpu for the GPU executor.
+    uses_with={'artifact': run.artifact_id, 'token': token},
+)
+```
+````
+````{tab} Locally saved artifact
 ```python
 from jina import Flow
 
 f = Flow().add(
-    uses='jinahub+docker://FinetunerExecutor',
-    uses_with={'artifact': 'model_dir/tuned_model', 'batch_size': 16},
+    uses='jinahub+docker://FinetunerExecutor/v0.9.2',  # use v0.9.2-gpu for the GPU executor.
+    uses_with={'artifact': '/mnt/YOUR-MODEL.zip'},
+    volumes=['/your/local/path/:/mnt']  # mount your model path into the container.
 )
 ```
 ````
@@ -26,19 +49,48 @@ with:
   port: 51000
   protocol: grpc
 executors:
-  - uses: jinahub+docker://FinetunerExecutor
+  - uses: jinahub+docker://FinetunerExecutor/v0.9.2
     with:
-      artifact: 'model_dir/tuned_model'
-      batch_size: 16
+      artifact: 'COPY-YOUR-ARTIFACT-ID-HERE'
+      token: 'COPY-YOUR-TOKEN-HERE'  # or better, set it as an env variable
 ```
 ````
-```{admonition} FinetunerExecutor via source code
-:class: tip
-You can also use the `FinetunerExecutor` via source code by specifying `jinahub://FinetunerExecutor` under the `uses` parameter.
-However, using docker images is recommended.
+
+As you can see, it's super easy!
+If you did not call `save_artifact`,
+you need to provide the `artifact_id` and `token`.
+`FinetunerExecutor` will automatically pull your model from cloud storage into the container.
+
+On the other hand,
+if you saved the artifact locally,
+please mount the zipped artifact into the docker container.
+`FinetunerExecutor` will unzip the artifact and load the model.
+
+You can start your Flow with:
+
+```python
+with f:
+    # in this example, we fine-tuned a BERT model and embed a Document.
+    returned_docs = f.post(
+        on='/encode',
+        inputs=DocumentArray(
+            [
+                Document(
+                    text='some text to encode'
+                )
+            ]
+        )
+    )
+
+    for doc in returned_docs:
+        print(f'Text of the returned document: {doc.text}')
+        print(f'Shape of the embedding: {doc.embedding.shape}')
 ```
 
-As you can see, it's super easy! We just provided the model path and the batch size.
+```console
+Text of the returned document: some text to encode
+Shape of the embedding: (768,)
+```
 
 In order to see what other options you can specify when initializing the executor, please go to the [`FinetunerExecutor`](https://hub.jina.ai/executor/13dzxycc) page and click on `Arguments` on the top-right side.
 
@@ -47,28 +99,47 @@ In order to see what other options you can specify when initializing the executo
 The only required argument is `artifact`. We provide default values for others.
 ```
 
+(integrate-with-docarray)=
+## Embed DocumentArray
 
-## Using `FinetunerExecutor`
-
-Here's a simple code snippet demonstrating the `FinetunerExecutor` usage in the Flow:
+Similarly, you can embed a [DocumentArray](https://docarray.jina.ai/) with the fine-tuned model:
 
+````{tab} Artifact id and token
 ```python
 from docarray import DocumentArray, Document
-from jina import Flow
+import finetuner
 
-f = Flow().add(
-    uses='jinahub+docker://FinetunerExecutor',
-    uses_with={'artifact': 'model_dir/tuned_model', 'batch_size': 16},
+finetuner.login()
+
+token = finetuner.get_token()
+run = finetuner.get_run(
+    experiment_name='YOUR-EXPERIMENT',
+    run_name='YOUR-RUN'
 )
 
-with f:
-    returned_docs = f.post(on='/encode', inputs=DocumentArray([Document(text='hello')]))
+da = DocumentArray([Document(text='some text to encode')])
 
-for doc in returned_docs:
-    print(f'Text of the returned document: {doc.text}')
-    print(f'Shape of the embedding: {doc.embedding.shape}')
+da.post(
+    'jinahub+docker://FinetunerExecutor/v0.9.2/encode',
+    uses_with={'artifact': run.artifact_id, 'token': token},
+)
+```
+````
+````{tab} Locally saved artifact
+```python
+from docarray import DocumentArray, Document
+
+da = DocumentArray([Document(text='some text to encode')])
+
+da.post(
+    'jinahub+docker://FinetunerExecutor/v0.9.2/encode',
+    uses_with={'artifact': '/mnt/YOUR-MODEL.zip'},
+    volumes=['/your/local/path/:/mnt']  # mount your model path into the container.
+)
+```
+````
+
+```console
+Text of the returned document: some text to encode
+Shape of the embedding: (768,)
 ```
-```bash
-Text of the returned document: hello
-Shape of the embedding: (1, 768)
-```
`````
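The `volumes=['/your/local/path/:/mnt']` option in the local-artifact tabs is a standard Docker bind mount: the host directory appears inside the container at `/mnt`, which is why `artifact` points at `/mnt/YOUR-MODEL.zip`. A small sketch of that path mapping (the `container_path` helper is ours, for illustration only):

```python
def container_path(host_path: str, volume: str) -> str:
    """Map a host path through a Docker-style 'host_dir:container_dir' mount."""
    host_dir, container_dir = volume.rsplit(':', 1)
    if not host_path.startswith(host_dir):
        raise ValueError(f'{host_path} is not under the mounted {host_dir}')
    relative = host_path[len(host_dir):].lstrip('/')
    return container_dir.rstrip('/') + '/' + relative

# The zipped artifact on the host becomes this path inside the executor container:
print(container_path('/your/local/path/YOUR-MODEL.zip', '/your/local/path/:/mnt'))
# -> /mnt/YOUR-MODEL.zip
```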

docs/walkthrough/save-model.md

Lines changed: 4 additions & 2 deletions
````diff
@@ -1,5 +1,5 @@
 (retrieve-tuned-model)=
-# Save Model
+# Save Artifact
 
 Perfect!
 Now you have started the fine-tuning job in the cloud.
@@ -17,8 +17,9 @@ experiment = finetuner.get_experiment('finetune-flickr-dataset')
 # connect to the run we created previously.
 run = experiment.get_run('finetune-flickr-dataset-efficientnet-1')
 print(f'Run status: {run.status()}')
+print(f'Run artifact id: {run.artifact_id}')
 print(f'Run logs: {run.logs()}')
-# save the model.
+# save the artifact.
 run.save_artifact('tuned_model')
 ```
@@ -28,6 +29,7 @@ you can see this in the terminal:
 ```bash
 🔐 Successfully login to Jina Ecosystem!
 Run status: FINISHED
+Run artifact id: 62972acb5de25a53fdbfcecc
 Run logs:
 
 Training [2/2] ━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50/50 0:00:00 0:01:08 • loss: 0.050
````
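The snippet above assumes the run has already reached `FINISHED`; in practice you poll `run.status()` before calling `save_artifact`. A runnable sketch of that flow using a hypothetical `MockRun` in place of the real run object (which `experiment.get_run(...)` returns, and which requires a cloud login):

```python
import time

# Hypothetical mock, only to illustrate the poll-then-save flow;
# the real object comes from experiment.get_run(...).
class MockRun:
    def __init__(self):
        self._polls = 0
        self.artifact_id = '62972acb5de25a53fdbfcecc'  # example id from the docs

    def status(self):
        self._polls += 1
        # Pretend the cloud job finishes on the second poll.
        return 'FINISHED' if self._polls >= 2 else 'STARTED'

    def save_artifact(self, directory):
        # The real API downloads the zipped artifact into `directory`.
        return f'{directory}/artifact.zip'

run = MockRun()
while run.status() != 'FINISHED':  # poll until the cloud job completes
    time.sleep(0)                  # real code would sleep for seconds

print(f'Run artifact id: {run.artifact_id}')
print(run.save_artifact('tuned_model'))  # -> tuned_model/artifact.zip
```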

requirements.txt

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,4 +1,4 @@
-docarray[common]>=0.13.19
+docarray[common]>=0.13.31
 jina-hubble-sdk>=0.8.1
 requests>=2.27.1
 rich>=12.4.4
```
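To check that an environment already satisfies the raised `docarray` floor, a quick sketch (the `meets_floor` helper is ours; it only handles plain dotted versions, not full PEP 440):

```python
def meets_floor(installed: str, floor: str) -> bool:
    """Compare dotted version strings numerically, e.g. 0.13.31 >= 0.13.19."""
    to_tuple = lambda v: tuple(int(part) for part in v.split('.'))
    return to_tuple(installed) >= to_tuple(floor)

print(meets_floor('0.13.31', '0.13.19'))  # -> True: the bumped version meets the old floor
print(meets_floor('0.13.19', '0.13.31'))  # -> False: the old pin fails the new floor
```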
