fix NoCacheLoadAndPost data corruption #2532


Open · wants to merge 2 commits into master

Conversation

@VVoidV (Contributor) commented Sep 28, 2024

size_orgmeta was not updated correctly when the cache space was insufficient, which resulted in previously uploaded data not being loaded correctly when re-entering the NoCacheLoadAndPost process.

Details

if(size_orgmeta <= offset){
    // all area is over of original size
    need_load_size = 0;
}else{
    if(size_orgmeta < (offset + oneread)){
        // original file size(on S3) is smaller than request.
        need_load_size = size_orgmeta - offset;
    }else{
        need_load_size = oneread;
    }
}
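
To make the failure mode concrete, here is a small standalone sketch of how a stale size_orgmeta clamps the load size to zero for a range that was in fact already written to S3. This is not s3fs code and the sizes are made up; it only mirrors the decision logic quoted above.

#include <algorithm>
#include <cstdio>

// Mirrors the quoted decision: how many bytes must be loaded from S3 for the
// range [offset, offset + oneread), given the remembered original object size.
static long long CalcNeedLoadSize(long long size_orgmeta, long long offset, long long oneread)
{
    if(size_orgmeta <= offset){
        return 0;                                   // whole range lies past the remembered size
    }
    return std::min(oneread, size_orgmeta - offset);
}

int main()
{
    const long long MiB     = 1024 * 1024;
    const long long offset  = 100 * MiB;            // hypothetical range uploaded in an earlier pass
    const long long oneread = 10 * MiB;

    // Stale value: still the object size from before the earlier upload pass,
    // so nothing is loaded and zero-filled data is uploaded instead.
    std::printf("stale:     %lld\n", CalcNeedLoadSize(80 * MiB, offset, oneread));   // 0

    // Refreshed value (what this PR arranges): the already-uploaded range is loaded again.
    std::printf("refreshed: %lld\n", CalcNeedLoadSize(200 * MiB, offset, oneread));  // 10 MiB
    return 0;
}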

The size_orgmeta was not updated correctly in cases of insufficient cache,
which resulted in the previously uploaded data not being loaded correctly
when re-entering the NoCacheLoadAndPost process.

Signed-off-by: liubingrun <liubr1@chinatelecom.cn>
@gaul (Member) commented Sep 29, 2024

Is there any kind of test we can add that exercises these code paths?

@VVoidV (Contributor, Author) commented Sep 29, 2024

Is there any kind of test we can add that exercises these code paths?

sure, will try and come back later.

@ggtakec (Member) commented Sep 29, 2024

@VVoidV Thanks for this PR.
As you pointed out, I think I missed updating the size.
I think this test case is very tricky, but we'll wait for you to get back.

@VVoidV (Contributor, Author) commented Sep 29, 2024

I'm back, with bad news...

Initially, I intended to limit the remaining disk space to verify my changes. However, I found that after writing the file, the ls command hangs. So I rolled back the code to master, but the hang still occurred.

It seems that a new deadlock has been introduced in master. I set fake_diskfree to 200 and copied a 180MB file to the test directory, and s3fs_release is hanging on fdent_lock.

Thread 3 (Thread 0x7f557f640640 (LWP 68656) "s3fs"):
#0  0x00007f5580286960 in __lll_lock_wait () from /lib64/libc.so.6
#1  0x00007f558028cff2 in pthread_mutex_lock@@GLIBC_2.2.5 () from /lib64/libc.so.6
#2  0x000000000046c2c4 in __gthread_mutex_lock (__mutex=0x7f557806caf0) at /usr/include/c++/11/x86_64-redhat-linux/bits/gthr-default.h:749
#3  std::mutex::lock (this=0x7f557806caf0) at /usr/include/c++/11/bits/std_mutex.h:100
#4  std::lock_guard<std::mutex>::lock_guard (__m=..., this=<synthetic pointer>) at /usr/include/c++/11/bits/std_mutex.h:229
#5  FdEntity::Clear (this=0x7f557806caf0) at fdcache_entity.cpp:124
#6  0x000000000046c512 in FdEntity::~FdEntity (this=0x7f557806caf0, __in_chrg=<optimized out>) at fdcache_entity.cpp:119
#7  0x0000000000467b12 in std::default_delete<FdEntity>::operator() (this=<optimized out>, __ptr=0x7f557806caf0) at /usr/include/c++/11/bits/unique_ptr.h:79
#8  std::default_delete<FdEntity>::operator() (__ptr=0x7f557806caf0, this=<optimized out>) at /usr/include/c++/11/bits/unique_ptr.h:79
#9  std::unique_ptr<FdEntity, std::default_delete<FdEntity> >::~unique_ptr (this=0x7f55780c0ea0, __in_chrg=<optimized out>) at /usr/include/c++/11/bits/unique_ptr.h:361
#...........
#18 FdManager::Close (this=0x4af3e0 <FdManager::singleton>, ent=0x7f557806caf0, fd=<optimized out>) at fdcache.cpp:737
#19 0x0000000000479ba9 in AutoFdEntity::Close (this=this@entry=0x7f557f63f900) at fdcache_auto.cpp:42
#20 AutoFdEntity::~AutoFdEntity (this=this@entry=0x7f557f63f9a0, __in_chrg=<optimized out>) at fdcache_auto.cpp:36
#21 0x000000000041bb83 in s3fs_release (_path=<optimized out>, fi=0x7f557f63fb10) at s3fs.cpp:3091
#22 0x00007f5581239183 in fuse_do_release () from /lib64/libfuse.so.2
#23 0x00007f558123eeaa in fuse_lib_release () from /lib64/libfuse.so.2
#24 0x00007f558124008d in do_release.lto_priv () from /lib64/libfuse.so.2
#25 0x00007f558124b9ac in fuse_ll_process_buf () from /lib64/libfuse.so.2
#26 0x00007f558123c9ad in fuse_do_work () from /lib64/libfuse.so.2
#27 0x00007f5580289c02 in start_thread () from /lib64/libc.so.6
#28 0x00007f558030ec40 in clone3 () from /lib64/libc.so.6

@VVoidV (Contributor, Author) commented Sep 29, 2024

I am confident that version 1.94, combined with the following patch (picked from 1a50b9a04a82678b05e36927c323f42d94ca4a07), should not cause such a deadlock.

0001-bugfix-Fixed-a-deadlock-in-the-FdManager-ChangeEntit.patch

@VVoidV (Contributor, Author) commented Sep 29, 2024

I am confident that version 1.94, combined with the following patch (picked from 1a50b9a04a82678b05e36927c323f42d94ca4a07), should not cause such a deadlock.

0001-bugfix-Fixed-a-deadlock-in-the-FdManager-ChangeEntit.patch

After testing with git bisect, I confirmed that the first bad commit is 1a50b9a.

@VVoidV (Contributor, Author) commented Sep 29, 2024

Oh, it seems to be related to UpdateEntityToTempPath(), as the FdEntity is being destructed twice.

I tried to fix this issue. Please check the case when cachedir is not specified (using the /tmp directory). At this point, temporary files are already being used. Is there still a need to switch to another tmpfile? @ggtakec

When the cachedir is not specified, FdEntity already uses a randomly prefixed key.
Therefore, when handling UpdateEntityToTempPath, it is necessary to perform a search through the map.
Otherwise, the fdent map will insert duplicate entries, leading to repeated destructors being called, which triggers the deadlock.

Signed-off-by: liubingrun <liubr1@chinatelecom.cn>
@VVoidV (Contributor, Author) commented Sep 29, 2024

Is there any kind of test we can add that exercises these code paths?

sure, will try and come back later.

Reproduce

For the test case, it is indeed difficult to reproduce. I used an additional tmpdir parameter to specify the temporary directory and limited its capacity to 6000MB, with ensure_diskfree set to 2048MB.

[root@centos8-original ~]# df /mnt/tmp_cache/ -m
Filesystem     1M-blocks  Used Available Use% Mounted on
tmpfs               6000     0      6000   0% /mnt/tmp_cache
s3fs -o use_path_request_style -o url=http://127.0.0.1:8000 -o use_xattr=1 -o stat_cache_expire=1 -o stat_cache_interval_expire=1 -o retries=3 -f -o ensure_diskfree=2048 -o use_xattr -o update_parent_dir_stat -o passwd_file=/root/.passwd-s3fs -o tmpdir=/mnt/tmp_cache/ test /mnt/s3

When copying a 10GB file to the s3fs directory, the process first enters NoCacheUploadAndPost, then executes RowFlushMixMultipart. As it continues writing, it re-enters NoCacheUploadAndPost.

At this point, the first segment in the pagelist needs to download data from S3. However, since size_orgmeta is not updated, this portion of data is not loaded, resulting in the subsequent upload of all-zero data.


[root@centos8-original ~]# diff /mnt/src/10G /mnt/s3/10G
Binary files /mnt/src/10G and /mnt/s3/10G differ

Test after fix

Using the modifications from this PR and executing the same test, the files are consistent.

[root@centos8-original ~]# cp /mnt/src/10G /mnt/s3/
[root@centos8-original ~]# diff /mnt/src/10G /mnt/s3/10G
[root@centos8-original ~]# 

if(GetStatsHasLock(st)){
    size_orgmeta = st.st_size;
} else {
    S3FS_PRN_ERR("fstat is failed by errno(%d), but continue...", errno);
Review comment (Member):
Should this return errno instead?

if(GetStatsHasLock(st)){
    size_orgmeta = st.st_size;
} else {
    S3FS_PRN_ERR("fstat is failed by errno(%d), but continue...", errno);
Review comment (Member):
Same as above.
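
For illustration, here is a minimal sketch of what the suggested change could look like inside the method, assuming the enclosing method reports failures as a negative errno (that return convention is my assumption, not something confirmed in this thread):

if(GetStatsHasLock(st)){
    size_orgmeta = st.st_size;
}else{
    // capture errno before any further call can overwrite it
    int save_errno = errno;
    S3FS_PRN_ERR("fstat is failed by errno(%d)", save_errno);
    return -save_errno;     // assumption: callers treat a negative errno as failure
}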

@ggtakec (Member) commented Oct 1, 2024

@VVoidV

I tried to fix this issue. Please check the case when cachedir is not specified (using the /tmp directory). At this point, temporary files are already being used. Is there still a need to switch to another tmpfile? @ggtakec

It is taking me a while to reproduce this problem.
Although I still can't reproduce it, I have identified what I think is the cause of the multiple release.

Could you test it by mixing the following into your code?

diff --git a/src/fdcache.cpp b/src/fdcache.cpp
index 16cb501..e20bf3f 100644
--- a/src/fdcache.cpp
+++ b/src/fdcache.cpp
@@ -739,6 +739,16 @@ bool FdManager::Close(FdEntity* ent, int fd)
                 // check another key name for entity value to be on the safe side
                 for(; iter != fent.end(); ){
                     if(iter->second.get() == ent){
+                        // [NOTE]
+                        // Since "ent" has already been freed just before this loop,
+                        // it must call "release()" here without being freed.
+                        // The return value of release() can be ignored, but clang-tidy
+                        // will warn.
+                        //
+                        const FdEntity* enttmp = iter->second.release();
+                        if(enttmp != ent){
+                            S3FS_PRN_DBG("Something wrong because the entity pointer is not same, but so nothing to do.");
+                        }
                         iter = fent.erase(iter);
                     }else{
                         ++iter;
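
For clarity, a small standalone example of the unique_ptr::release() behavior the diff relies on: release() detaches ownership and returns the raw pointer without invoking the deleter, which is exactly what prevents a second delete of an entity that has already been freed. Generic types only, not s3fs code.

#include <cstdio>
#include <memory>

int main()
{
    std::unique_ptr<int> owner(new int(42));

    // release() gives up ownership and hands back the raw pointer
    // WITHOUT deleting it, so no delete runs here.
    int* detached = owner.release();

    std::printf("owner is now %s\n", owner ? "non-empty" : "empty");   // "empty"
    std::printf("value is still %d\n", *detached);                     // 42

    delete detached;    // the caller is now responsible for freeing it
    return 0;
}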

This problem is probably not a problem with UpdateEntityToTempPath(), but is rather caused by the fact that the possibility of the same FdEntity pointer being set into the map multiple times was overlooked when fixing #2383.

The above fix code has been filed as a PR (#2538), so I think it will be reviewed and merged.
If possible, please test whether the problem is resolved by referring to the above diff.

Thanks in advance for your kindness.

@VVoidV (Contributor, Author) commented Oct 2, 2024

This problem is probably not a problem with UpdateEntityToTempPath(), but is rather caused by the fact that the possibility of the same FdEntity pointer being set into the map multiple times was overlooked when fixing #2383.

The above fix code has been filed as a PR (#2538), so I think it will be reviewed and merged. If possible, please test whether the problem is resolved by referring to the above diff.

Thanks in advance for your kindness.


I observed that when the cache dir is not specified, the keys stored in the fent map are already random paths. However, the keys placed in except_fent by ChangeEntityToTempPath still use the actual paths. This discrepancy causes issues during the update: it cannot find the corresponding entries in fent based on the path, so it enters the else branch. Here is the point where the FdEntity pointer is inserted a second time:

s3fs-fuse/src/fdcache.cpp, lines 772 to 792 at b283ab2:

    fdent_map_t::iterator iter = fent.find(except_iter->first);
    if(fent.end() != iter && iter->second.get() == except_iter->second){
        // Move the entry to the new key
        fent[tmppath] = std::move(iter->second);
        iter = fent.erase(iter);
        except_iter = except_fent.erase(except_iter);
    }else{
        // [NOTE]
        // ChangeEntityToTempPath method is called and the FdEntity pointer
        // set into except_fent is mapped into fent.
        // And since this method is always called before manipulating fent,
        // it will not enter here.
        // Thus, if it enters here, a warning is output.
        //
        S3FS_PRN_WARN("For some reason the FdEntity pointer(for %s) is not found in the fent map. Recovery procedures are being performed, but the cause needs to be identified.", except_iter->first.c_str());
        // Add the entry for recovery procedures
        fent[tmppath] = std::unique_ptr<FdEntity>(except_iter->second);
        except_iter = except_fent.erase(except_iter);
    }
}
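
As a standalone illustration of why that recovery branch is dangerous: wrapping a raw pointer that is still owned by another unique_ptr creates double ownership, so the destructor and delete run twice on the same object. The example below is generic (a toy Entity type, not s3fs code) and deliberately exhibits the double delete:

#include <cstdio>
#include <map>
#include <memory>
#include <string>

struct Entity {
    ~Entity() { std::printf("~Entity() on %p\n", static_cast<void*>(this)); }
};

int main()
{
    std::map<std::string, std::unique_ptr<Entity>> fent_like;

    fent_like["real/path"] = std::unique_ptr<Entity>(new Entity());

    // Wrapping the same raw pointer in a second unique_ptr under another key
    // (as the quoted else branch does) creates two owners of one object.
    Entity* raw = fent_like["real/path"].get();
    fent_like["tmp/path"] = std::unique_ptr<Entity>(raw);

    // When the map is destroyed, ~Entity() runs twice on the same address;
    // the second delete is undefined behavior (double free / crash / hang).
    return 0;
}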

gaul added a commit to gaul/s3fs-fuse that referenced this pull request Oct 12, 2024
FdEntity may have multiple references due to ChangeEntityToTempPath.
This relies on the std::enable_shared_from_this helper to create a
std::shared_ptr from this.  Fixes s3fs-fuse#2532.
gaul added a commit to gaul/s3fs-fuse that referenced this pull request Oct 19, 2024
FdEntity may have multiple references due to ChangeEntityToTempPath.
This relies on the std::enable_shared_from_this helper to create a
std::shared_ptr from this.  Fixes s3fs-fuse#2532.
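
For context, a minimal standalone sketch of the std::enable_shared_from_this pattern these commits refer to (generic names, not the actual s3fs change): an object managed by a shared_ptr can hand out further owning references to itself, so several map entries can refer to the same entity and the destructor still runs exactly once.

#include <cstdio>
#include <map>
#include <memory>
#include <string>

class Entity : public std::enable_shared_from_this<Entity> {
public:
    ~Entity() { std::printf("~Entity() called once\n"); }

    // Returns another owning reference to this object; safe because the
    // object is already managed by a shared_ptr (created via make_shared).
    std::shared_ptr<Entity> GetRef() { return shared_from_this(); }
};

int main()
{
    std::map<std::string, std::shared_ptr<Entity>> fent_like;

    std::shared_ptr<Entity> ent = std::make_shared<Entity>();
    fent_like["real/path"] = ent;
    fent_like["tmp/path"]  = ent->GetRef();   // second key, same object, shared ownership

    fent_like.clear();                        // drops only the map's two references
    std::printf("use_count after clear: %ld\n", ent.use_count());   // 1

    return 0;                                 // last reference released here; destructor runs once
}
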
@ggtakec closed this in 4b6e532 Oct 20, 2024
@ggtakec (Member) commented Oct 20, 2024

After merging #2541, this PR was automatically closed for some reason, so I will reopen it.

@ggtakec reopened this Oct 20, 2024
@gaul (Member) left a comment

Can you test the latest master and rebase or close this PR as needed?

@VVoidV (Contributor, Author) commented Oct 30, 2024

Sorry for the late reply. I tested again with the 1.95 code. The cachedir is not specified, so the key in fent is already a temp path. When ChangeEntityToTempPath() is called, the key in except_fent becomes the real path. This causes UpdateEntityToTempPath to not find the entity and put it back into fent, which results in two entries in the fent map that hold the same FdEntity.
I am not sure this is expected. @gaul @ggtakec

@ggtakec (Member) commented Oct 30, 2024

@VVoidV I think I understand the problem.

If the cache directory is not specified (there may be cases where FdEntity::NoCacheLoadAndPost has already been called), we should not execute the operations of FdManager::ChangeEntityToTempPath, because the target path no longer exists in fent.
In that case we should not execute FdManager::UpdateEntityToTempPath either.
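
A rough sketch of the kind of guard being described, with hypothetical names (this is not the actual #2582 change):

// Hypothetical sketch: when no cache directory is configured, entries in fent
// are already keyed by temporary paths, so there is nothing to re-key and both
// ChangeEntityToTempPath and UpdateEntityToTempPath can simply do nothing.
bool ChangeEntityToTempPathSketch(bool has_cache_dir)
{
    if(!has_cache_dir){
        return true;    // skip: the entity already uses a temporary file/key
    }
    // ... otherwise move the entry into except_fent as before ...
    return true;
}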

I have posted a draft PR for this fix in #2582.
Could you try this code?

@ggtakec (Member) commented May 31, 2025

@VVoidV Sorry for the very late merge of #2582.
If you are still able to try this, please do.
Thanks in advance for your kindness.
