
Commit 7fb65db

Merge pull request #27 from CompVis/public
Add new models
2 parents: f13bf9b + 682030b

File tree

12 files changed: +1884 additions, -75 deletions


README.md

Lines changed: 103 additions & 56 deletions
Large diffs are not rendered by default.

assets/birdhouse.png (757 KB)

assets/txt2img-convsample.png (302 KB)

assets/txt2img-preview.png (2.15 MB)

New file (class-conditional latent diffusion config): 68 additions, 0 deletions

@@ -0,0 +1,68 @@
model:
  base_learning_rate: 0.0001
  target: ldm.models.diffusion.ddpm.LatentDiffusion
  params:
    linear_start: 0.0015
    linear_end: 0.0195
    num_timesteps_cond: 1
    log_every_t: 200
    timesteps: 1000
    first_stage_key: image
    cond_stage_key: class_label
    image_size: 64
    channels: 3
    cond_stage_trainable: true
    conditioning_key: crossattn
    monitor: val/loss
    use_ema: False

    unet_config:
      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        image_size: 64
        in_channels: 3
        out_channels: 3
        model_channels: 192
        attention_resolutions:
        - 8
        - 4
        - 2
        num_res_blocks: 2
        channel_mult:
        - 1
        - 2
        - 3
        - 5
        num_heads: 1
        use_spatial_transformer: true
        transformer_depth: 1
        context_dim: 512

    first_stage_config:
      target: ldm.models.autoencoder.VQModelInterface
      params:
        embed_dim: 3
        n_embed: 8192
        ddconfig:
          double_z: false
          z_channels: 3
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 2
          - 4
          num_res_blocks: 2
          attn_resolutions: []
          dropout: 0.0
        lossconfig:
          target: torch.nn.Identity

    cond_stage_config:
      target: ldm.modules.encoders.modules.ClassEmbedder
      params:
        n_classes: 1001
        embed_dim: 512
        key: class_label
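
A hedged note on how a config like this is consumed: in the latent-diffusion codebase, YAML configs are read with OmegaConf and each `target` class is instantiated with its `params`. The sketch below follows that pattern, assuming ldm.util.instantiate_from_config behaves as in the repo; the config and checkpoint paths are placeholders.

# Minimal sketch: build the class-conditional LatentDiffusion model from a config like the one above.
# Paths are placeholders; instantiate_from_config is assumed to match the helper in ldm/util.py.
import torch
from omegaconf import OmegaConf
from ldm.util import instantiate_from_config

config = OmegaConf.load("path/to/class-conditional-config.yaml")  # placeholder path
model = instantiate_from_config(config.model)  # builds ldm.models.diffusion.ddpm.LatentDiffusion

# Optionally load pretrained weights (checkpoint path is a placeholder).
ckpt = torch.load("path/to/model.ckpt", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)
missing, unexpected = model.load_state_dict(state_dict, strict=False)
model.eval()
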
New file (text-to-image latent diffusion config): 71 additions, 0 deletions

@@ -0,0 +1,71 @@
model:
  base_learning_rate: 5.0e-05
  target: ldm.models.diffusion.ddpm.LatentDiffusion
  params:
    linear_start: 0.00085
    linear_end: 0.012
    num_timesteps_cond: 1
    log_every_t: 200
    timesteps: 1000
    first_stage_key: image
    cond_stage_key: caption
    image_size: 32
    channels: 4
    cond_stage_trainable: true
    conditioning_key: crossattn
    monitor: val/loss_simple_ema
    scale_factor: 0.18215
    use_ema: False

    unet_config:
      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
      params:
        image_size: 32
        in_channels: 4
        out_channels: 4
        model_channels: 320
        attention_resolutions:
        - 4
        - 2
        - 1
        num_res_blocks: 2
        channel_mult:
        - 1
        - 2
        - 4
        - 4
        num_heads: 8
        use_spatial_transformer: true
        transformer_depth: 1
        context_dim: 1280
        use_checkpoint: true
        legacy: False

    first_stage_config:
      target: ldm.models.autoencoder.AutoencoderKL
      params:
        embed_dim: 4
        monitor: val/rec_loss
        ddconfig:
          double_z: true
          z_channels: 4
          resolution: 256
          in_channels: 3
          out_ch: 3
          ch: 128
          ch_mult:
          - 1
          - 2
          - 4
          - 4
          num_res_blocks: 2
          attn_resolutions: []
          dropout: 0.0
        lossconfig:
          target: torch.nn.Identity

    cond_stage_config:
      target: ldm.modules.encoders.modules.BERTEmbedder
      params:
        n_embed: 1280
        n_layer: 32
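
This second config wires a BERTEmbedder caption encoder (n_embed 1280, matching the UNet's context_dim) to a 4-channel, 32x32 latent space that the AutoencoderKL first stage decodes back to 256x256 images. As a hedged illustration, the sketch below outlines text-to-image sampling loosely following the repo's sampling scripts; the prompt, paths, step count, and guidance scale are placeholders, and the helper names (get_learned_conditioning, decode_first_stage, DDIMSampler.sample) are assumed to match the repo's API.

# Minimal txt2img sampling sketch (assumed API, loosely based on the repo's sampling scripts).
import torch
from omegaconf import OmegaConf
from ldm.util import instantiate_from_config
from ldm.models.diffusion.ddim import DDIMSampler

config = OmegaConf.load("path/to/txt2img-config.yaml")  # placeholder path
model = instantiate_from_config(config.model).cuda().eval()
# Load pretrained weights here as in the previous sketch before sampling.
sampler = DDIMSampler(model)

prompt = "a painting of a birdhouse in a forest"  # placeholder prompt
batch_size = 4

with torch.no_grad():
    # Encode captions with the cond stage (BERTEmbedder -> context vectors of dim 1280).
    c = model.get_learned_conditioning(batch_size * [prompt])
    uc = model.get_learned_conditioning(batch_size * [""])  # empty prompts for classifier-free guidance

    # DDIM sampling in latent space: channels=4, spatial size 32 (256 / 8), as in the config.
    samples, _ = sampler.sample(S=50, conditioning=c, batch_size=batch_size,
                                shape=[4, 32, 32], verbose=False,
                                unconditional_guidance_scale=5.0,
                                unconditional_conditioning=uc, eta=0.0)

    # Decode latents to 256x256 RGB images with the AutoencoderKL first stage.
    x = model.decode_first_stage(samples)
    x = torch.clamp((x + 1.0) / 2.0, min=0.0, max=1.0)
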

0 commit comments
