SAEHD model options.
Random_flip.
Randomly flips the image horizontally (left to right). Allows better generalization of faces. Slightly slows down training until a clear face is achieved. If both the src and dst face sets are quite diverse, this option is not useful. You can turn it off later in training.
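As a sketch, the flip is just a reversal of the width axis applied with some probability. The function below is an illustration, not DeepFaceLab's actual augmentation code; the (H, W, C) image layout and the probability parameter are assumptions:

```python
import numpy as np

def random_flip(image: np.ndarray, rng: np.random.Generator, p: float = 0.5) -> np.ndarray:
    """Flip an (H, W, C) image horizontally with probability p."""
    if rng.random() < p:
        return image[:, ::-1, :]  # reverse the width axis
    return image

# A 1x2 "image" makes the flip visible: [1, 2] becomes [2, 1].
img = np.array([[[1.0], [2.0]]])                        # shape (1, 2, 1)
flipped = random_flip(img, np.random.default_rng(0), p=1.0)
```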
Batch_size.
Improves facial generalization, especially useful at an early stage, but increases the time until a clear face is achieved. Increases memory usage. For the quality of the final deepfake, the higher the value, the better. It is not worth setting it below 4.
Resolution.
At first glance, the more the better. However, if the face in the frame is small, there is no point in choosing a large resolution. Increasing the resolution increases the training time. For face_type=wf, a higher resolution is required, because the coverage of the face is larger and the relative detail of the face is therefore reduced. For wf it makes no sense to choose less than 224.
Face_type.
Face coverage during training. The more of the facial area is covered, the more plausible the result will be.
whole_face covers the area down below the chin and up to the forehead. However, there is no automatic mask that includes the forehead, so XSeg is required for the merge, or manual masking in DaVinci Resolve or Adobe After Effects.
Archi.
Liae produces more morphing toward the dst face, but the src face will still be recognizable in it.
Df produces the most believable face, but requires more manual work: collecting a good variety of src faces and doing the final color matching.
The effectiveness of the hd architectures has not been proven at this time. The hd architectures were designed to better smooth the subpixel transitions of the face under micro-displacements, but micro-shake can also be eliminated with df, see below.
Ae_dims.
Dimensions of the main "brain" of the network, which is responsible for handling the facial representations created by the encoder and for supplying varied codes to the decoder.
E_dims.
Dimensions of the encoder network, responsible for detecting and recognizing the face. When these dimensions are not enough and the face samples are too diverse, non-standard cases (those that differ most from the general cases) are sacrificed, reducing their quality.
D_dims.
Dimensions of the decoder network, responsible for generating the image from the code received from the brain of the network. When these dimensions are not enough and the output faces differ too much in color, lighting, etc., the maximum achievable sharpness has to be sacrificed.
D_mask_dims.
Dimensions of the mask decoder network, responsible for forming the mask image.
16-22 is a normal value for a deepfake without a mask edited in the XSeg editor.
At the moment there is no experimentally proven data indicating which values are better. All we know is that with really low values the error curve reaches a plateau quickly and the face never becomes clear.
Masked_training. (whole_face only.)
Enabled (default): trains only the area inside the face mask; anything outside that area is ignored. This lets the network focus on the face alone, speeding up the training of the face and facial expressions.
When the face is sufficiently trained, you can disable this option; then everything outside the face (the forehead, part of the hair, the background) will also be trained.
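The idea of training only inside the mask can be sketched as a masked loss; this is an illustrative mean-squared-error version, not SAEHD's exact loss:

```python
import numpy as np

def masked_loss(pred: np.ndarray, target: np.ndarray, mask: np.ndarray) -> float:
    """Mean squared error computed only where mask is 1; pixels outside are ignored."""
    sq_err = (pred - target) ** 2 * mask
    return float(sq_err.sum() / max(mask.sum(), 1.0))

pred   = np.array([[1.0, 9.0]])
target = np.array([[1.0, 0.0]])
mask   = np.array([[1.0, 0.0]])   # the large error in the right pixel is outside the mask
loss = masked_loss(pred, target, mask)
```

Because the mismatched pixel lies outside the mask, it contributes nothing to the loss.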
Eyes_prio.
Sets a higher priority for image reconstruction in the eye area, improving the generalization and matching of the eyes of the two faces. Increases iteration time.
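A sketch of the idea: weight the per-pixel error higher inside an eye mask. The weight value 10 is an illustrative assumption, not the factor SAEHD actually uses:

```python
import numpy as np

def eyes_prio_loss(pred: np.ndarray, target: np.ndarray, eye_mask: np.ndarray,
                   eye_weight: float = 10.0) -> float:
    """Squared error with pixels inside the eye mask weighted eye_weight times higher."""
    weights = 1.0 + (eye_weight - 1.0) * eye_mask
    return float(((pred - target) ** 2 * weights).mean())

pred     = np.zeros(2)
target   = np.ones(2)
eye_mask = np.array([1.0, 0.0])   # left "pixel" belongs to the eye region
loss = eyes_prio_loss(pred, target, eye_mask)
```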
Lr_dropout.
Enable only when the face is already sufficiently trained. Enhances facial detail and improves subpixel facial transitions, reducing shake.
Uses more video memory, so take this option into account when choosing a network configuration for your graphics card.
Random_warp.
Disable only when the face is already sufficiently trained. Disabling it improves facial detail and the subpixel transitions of facial features, reducing shake.
GAN_power.
Improves facial detail. Enable only when the face is already sufficiently trained. Requires more memory and greatly increases iteration time.
It works on the generative-adversarial principle. At first you will see artifacts in areas that do not yet match the sharpness of the target image, such as teeth and eye edges, so train long enough.
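The weighting itself is simple: the adversarial term is added on top of the usual reconstruction loss, scaled by GAN_power. This is a sketch of the general GAN-weighting pattern, not SAEHD's full loss, which has more terms:

```python
def combined_loss(recon_loss: float, adv_loss: float, gan_power: float) -> float:
    """Reconstruction loss plus the adversarial term scaled by gan_power; 0.0 disables the GAN."""
    return recon_loss + gan_power * adv_loss

# With gan_power = 0.0 the adversarial term has no effect at all.
baseline = combined_loss(0.5, 2.0, 0.0)
with_gan = combined_loss(0.5, 2.0, 0.1)
```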
True_face_power.
Experimental option. You do not have to turn it on. Adjusts the predicted face toward src in the most "hard" way. Artifacts and incorrect light transfer from dst may appear.
Face_style_power.
Adjusts the color distribution of the predicted face inside the mask area toward dst. Artifacts may appear. The face may become more like dst. The model may collapse.
Start at 0.0001, watch the changes in preview_history, and turn on hourly backups.
Bg_style_power.
Trains the area of the predicted face outside the face mask to equal the same area of the dst face. The predicted face thereby becomes similar to a morph of the dst face, with less recognizable src facial features.
Face_style_power and Bg_style_power should work as a pair, so that the complexion fits dst and the background is taken from dst. Morphing removes many problems with color and face matching, but at the cost of src recognizability in the result.
ct_mode.
Used to fit the average color distribution of the src face set to dst. Unlike Face_style_power it is a safer method, but an identical color transfer is not guaranteed. Try each mode, check in the preview history which one is closer to dst, and train with that one.
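As an illustration of the general idea, a per-channel mean/std color transfer is sketched below. DeepFaceLab's actual ct modes (rct, lct, mkl, sot, etc.) are more sophisticated than this:

```python
import numpy as np

def match_color_stats(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Shift and scale each channel of src so its mean and std match dst."""
    out = src.astype(np.float64).copy()
    for c in range(out.shape[-1]):
        s_mean, s_std = out[..., c].mean(), out[..., c].std()
        d_mean, d_std = dst[..., c].mean(), dst[..., c].std()
        if s_std > 1e-8:
            out[..., c] = (out[..., c] - s_mean) / s_std * d_std + d_mean
        else:
            out[..., c] = d_mean  # flat channel: just adopt dst's mean
    return out

src = np.array([[[0.0], [2.0]]])    # mean 1, std 1
dst = np.array([[[10.0], [14.0]]])  # mean 12, std 2
out = match_color_stats(src, dst)
```

After the transfer, src's pixels are remapped onto dst's color statistics.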
Clipgrad.
Reduces the chance of model collapse to almost zero. Model collapse is when artifacts appear or the predicted-face windows are filled with a single color. It can occur when using certain options or when the dst face set is not varied enough.
Therefore it is best to use autobackup every 2-4 hours; if collapse occurs, roll back and turn on clipgrad.
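Gradient clipping in general works like the global-norm sketch below; the max_norm value is illustrative, not DeepFaceLab's internal setting:

```python
import numpy as np

def clip_by_global_norm(grads: list, max_norm: float = 1.0) -> list:
    """Scale all gradients so their combined L2 norm does not exceed max_norm."""
    total = float(np.sqrt(sum(float((g ** 2).sum()) for g in grads)))
    if total <= max_norm:
        return grads
    scale = max_norm / total
    return [g * scale for g in grads]

grads = [np.array([3.0, 4.0])]            # combined L2 norm is 5.0
clipped = clip_by_global_norm(grads, max_norm=1.0)
```

By capping the norm, a single bad batch cannot produce an update large enough to destabilize the weights, which is why it guards against collapse.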
Pretrain.
Enables model pre-training, performed on a prepared set of 24 thousand faces. Using a pre-trained model speeds up the training of any deepfake.
It is recommended to pretrain as long as possible: 1-2 days is good, 2 weeks is perfect. At the end of pre-training, save the model files for later use, then switch the option off and train as usual.
You can and should share your pre-trained model with the community.
Size of the src and dst face sets.
The problem with a very large number of src images is repetitive faces, which contribute little. Faces with rare angles then train less frequently, which hurts quality. Therefore 3000-4000 faces is optimal for the src set. If you have more than 5000 faces, use sort by best to reduce the set; sorting selects an optimal mix of angles and color variety.
The same logic applies to dst. But dst consists of frames from the video, each of which must be trained well enough to be identified by the neural network in close-ups. So if you have too many faces in dst, 3000 or more, it is optimal to back them up, sort by best down to 3000, train the network for, say, 100,000 iterations, then restore the original number of dst faces and train further until the optimal result is achieved.
How to get lighting similar to the dst face?
This is about lighting, not color matching. It comes down to collecting a more diverse src face set.
How to suppress color flickering in a DF model?
If the src face set contains a variety of make-up, it can lead to color flickering in a DF model. One option: at the end of training, leave at least 150 faces with the same make-up and train for several hours.
How else can you adjust the color of the predicted face to dst?
If nothing fits automatically, composite the faces in a video editor. A video editor gives you much more freedom to match colors.
How to make a face look more like src?
1. Use the DF architecture.
2. Use a dst with a similar face shape.
3. It is known that a large color variety in the src face set decreases facial resemblance, because a neural network essentially interpolates the face from what it has seen.
For example, if your src set contains faces from 7 different color scenes and the total is only 1500 faces, then each dst scene is effectively served by about 1500 / 7 faces, which is 7 times poorer than using 1500 faces of one scene. As a result, the predicted face will differ greatly from src.
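The arithmetic of the example, made explicit with the numbers from the text:

```python
total_src_faces = 1500
color_scenes = 7

# Each dst scene is effectively served by only ~214 src faces,
# 7 times fewer than if all 1500 faces came from a single scene.
faces_per_scene = total_src_faces // color_scenes
```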
Micro-shake of the predicted face in the final video.
The higher the resolution of the model, the longer it needs to be trained to suppress micro-shake.
You should also enable lr_dropout and disable random_warp after 200-300k iterations at batch_size 8.
Micro-shake can also appear if the dst video is too sharp: it is difficult for a neural network to extract unambiguous information about a face that is saturated with micro-pixel noise. Therefore, after extracting frames from the dst video and before extracting faces, you can run the frames through the noise filter "denoise data_dst images.bat". This filter removes temporal noise.
Increasing ae_dims may also suppress micro-shake.
Use a quick model to check the generalization of facial features.
If you are planning a higher-resolution deepfake, start by training at least a few hours at resolution 96. This helps identify facial-generalization problems and correct the face sets.
Examples of such problems:
1. Eyes/mouth that do not close: no closed eyes/mouth in src.
2. Wrong face rotation: not enough faces with different turns in both the src and dst face sets.
Training algorithm for achieving high definition.
1. Use a -ud model.
2. Train, say, up to 300k iterations.
3. Enable learning rate dropout for 100k.
4. Disable random warp for 50k.
5. Use GAN.
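The steps above can be sketched as an iteration-based schedule. The thresholds 300k/100k/50k come from the list; the option names mirror the SAEHD settings, but the function itself and the gan_power value are only an illustration:

```python
def options_for(iteration: int) -> dict:
    """Option toggles per iteration, following the schedule:
    300k plain -> 100k with lr_dropout -> 50k without random_warp -> GAN."""
    opts = {"lr_dropout": False, "random_warp": True, "gan_power": 0.0}
    if iteration >= 300_000:
        opts["lr_dropout"] = True
    if iteration >= 400_000:          # after 100k with lr_dropout
        opts["random_warp"] = False
    if iteration >= 450_000:          # after 50k without random_warp
        opts["gan_power"] = 0.1       # illustrative value
    return opts
```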
Do not use the training GPU for video output.
It can reduce performance, reduce the amount of free GPU video memory, and in some cases lead to OOM errors.
Buy a second cheap video card, such as a GT 730 or similar, and use it for video output.
There is also the option of using the GPU built into Intel processors: activate it in the BIOS, install the drivers, and connect the monitor to the motherboard.
Using multi-GPU.
Multi-GPU can improve the quality of the deepfake. In some cases, it can also speed up training.
Choose identical GPU models, otherwise the fast one will wait for the slow one and you will not get the acceleration.
How it works: batch_size is divided among the GPUs. Accordingly, you either get a speed-up because each GPU is allocated less work, or you multiply batch_size by the number of GPUs, increasing the quality of the deepfake.
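The batch split can be sketched in a few lines, assuming batch_size divides evenly among the GPUs:

```python
def per_gpu_batch(batch_size: int, n_gpus: int) -> int:
    """Each GPU processes batch_size // n_gpus samples per iteration."""
    if batch_size % n_gpus != 0:
        raise ValueError("choose a batch_size divisible by the number of GPUs")
    return batch_size // n_gpus

# With batch_size 8 on 2 identical GPUs, each GPU gets 4 samples per step;
# alternatively, raise batch_size to 16 to keep 8 per GPU and gain quality.
split = per_gpu_batch(8, 2)
```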
In some cases, disabling model_opts_on_gpu can speed up training when using 4 or more GPUs.
As the number of samples increases, the CPU load for generating samples increases. It is therefore recommended to use a latest-generation CPU and memory.
NVLink and SLI do not work and are not used. Moreover, enabled SLI may cause errors.
Factors that reduce the success of a deepfake.
1. A large face in the frame.
2. Side lighting, transitional lighting, colored lighting.
3. An insufficiently diverse dst face set.
For example, you train a deepfake where the entire dst face set is a head turned to one side. Face generation in this case can be poor. The solution: extract additional faces of the same actor, train them well enough, then leave only the target faces in dst.
Factors that increase the success of a deepfake.
1. Variety of src faces: different angles, including side views, and varied lighting.
Other.
In 2018, when deepfakes first appeared, people enjoyed deepfakes of any lousy quality, where a face was merely glimpsed and barely resembled the target celebrity. Now, even a technically perfect replacement using an impersonator similar to the target celebrity may not go viral at all. Popular YouTube channels specializing in deepfakes constantly invent something new to keep their audience. If you have watched a lot of movies and know all the meme videos, you can probably come up with great ideas for a deepfake. A good idea is 50% of success. Technical quality can be increased through practice.
Not every celebrity pair works well for a deepfake. If the skull sizes differ significantly, the resemblance of the result will be extremely low. With experience, a deepfaker comes to understand which deepfakes will turn out well and which will not.