zipseq file type mix handling

---

_**User (2025-12-29 17:37Z)**_

does zipseq currently scan other types in the seq than png? e.g. jpg, exr

---

_**Agent (model default, mode Agent)**_

---

Checking which file types `zipseq` scans:

---

<tool-use data-tool-type="read" data-tool-name="read_file">

<details><summary>Tool use: **read_file** • Read file: zip_sequences.py</summary>

</details>

</tool-use>

---

<tool-use data-tool-type="read" data-tool-name="read_file">

<details><summary>Tool use: **read_file** • Read file: zip_sequences.py</summary>

</details>

</tool-use>

---
Yes. `zipseq` scans multiple file types, not just PNG.

```45:52:zip_sequences.py
SEQUENCE_EXTENSIONS = {
    ".png",
    ".jpg",
    ".jpeg",
    ".tif",
    ".tiff",
    ".exr",
}
```

The `find_sequence_dirs()` function uses this set to identify sequence directories:

```491:491:zip_sequences.py
has_frames = any(Path(dirpath, f).suffix.lower() in SEQUENCE_EXTENSIONS for f in filenames)
```

Supported extensions:

- `.png`
- `.jpg`
- `.jpeg`
- `.tif`
- `.tiff`
- `.exr`

All of these are included in archives and tracked in manifests.
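
As a quick illustration, here is a minimal, self-contained sketch of that check; `looks_like_sequence_dir` and the sample paths are hypothetical, but the suffix test mirrors the `has_frames` line above:

```python
from pathlib import Path

# Same extension set as zip_sequences.py
SEQUENCE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".exr"}

def looks_like_sequence_dir(dirpath: str, filenames: list[str]) -> bool:
    """Return True if any file in the listing has a sequence extension (case-insensitive)."""
    return any(Path(dirpath, f).suffix.lower() in SEQUENCE_EXTENSIONS for f in filenames)

print(looks_like_sequence_dir("shots/sh010", ["frame_0001.EXR", "notes.txt"]))  # True
print(looks_like_sequence_dir("shots/sh020", ["readme.md", "render.mp4"]))      # False
```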
---

_**User (2025-12-29 17:37Z)**_

if there's a sequence with both jpg and png (by mistake) will it scan and compress both?

---

_**Agent (model default, mode Agent)**_

---
Yes. If a sequence directory contains both JPG and PNG files, both will be scanned and compressed.

How it works:

1. `find_sequence_dirs()` checks if a directory has any files matching `SEQUENCE_EXTENSIONS` (`.png`, `.jpg`, `.jpeg`, `.tif`, `.tiff`, `.exr`). If it finds any, it treats the directory as a sequence directory.
2. Once identified as a sequence directory, `iter_sequence_files()` includes all files in that directory (except `Thumbs.db` and archive paths), regardless of extension:

```517:524:zip_sequences.py
def iter_sequence_files(seq_dir: Path) -> Iterator[Path]:
    for dirpath, dirnames, filenames in os.walk(seq_dir):
        path = Path(dirpath)
        dirnames[:] = [d for d in dirnames if d not in ("_archive", "_CURRENT")]
        if is_archive_path(path):
            continue
        for filename in filenames:
            # Ignore Thumbs.db files completely
            if filename.lower() == "thumbs.db":
                continue
            yield path / filename
```

So if a directory has both `frame_001.jpg` and `frame_001.png`, both will be:

- Included in the state manifest
- Compressed into the archive

The extension check only determines whether a directory is a sequence directory; it doesn't filter which files get included once it's identified as one.
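
Since `zipseq` itself won't filter the stray files out, a standalone check along these lines could flag an accidental mix before archiving (a sketch only; `image_extension_counts` and the `shots/sh010` path are hypothetical, not part of `zip_sequences.py`):

```python
from collections import Counter
from pathlib import Path

SEQUENCE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".exr"}

def image_extension_counts(seq_dir: Path) -> Counter:
    """Count image files per extension in a sequence directory (non-recursive sketch)."""
    return Counter(
        p.suffix.lower()
        for p in seq_dir.iterdir()
        if p.is_file() and p.suffix.lower() in SEQUENCE_EXTENSIONS
    )

counts = image_extension_counts(Path("shots/sh010"))
if len(counts) > 1:
    # e.g. Counter({".png": 240, ".jpg": 240}) -- zipseq would archive both sets
    print(f"Mixed sequence detected: {dict(counts)}")
```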
---

_**User (2025-12-29 17:37Z)**_

I had one with jpgs that I removed so only the pngs remain, but the manifest isn't changing.

---

_**Agent (model default, mode Agent)**_

---

Checking the state comparison logic to see why the manifest isn't updating after removing JPGs:

---

<tool-use data-tool-type="read" data-tool-name="read_file">

<details><summary>Tool use: **read_file** • Read file: zip_sequences.py</summary>

</details>

</tool-use>

---

<tool-use data-tool-type="search" data-tool-name="codebase_search">

<details>
<summary>Tool use: **codebase_search** • Searched codebase "How does the quick skip check work when directory mtime is older than archive?" • **14** results</summary>

| File | Lines |
|------|-------|
| `zip_sequences.py` | L981-1156 |
| `zip_sequences.py` | L979-1156 |
| `scripts_old/0MoveToCurrent.bat` | L61 |
| `zip_sequences.py` | L563-650 |
| `zip_sequences.py` | L471-513 |
| `zip_sequences.py` | L516-560 |
| `zip_sequences.py` | L1159-1286 |
| `zip_sequences.py` | L676-969 |
| `zip_sequences.py` | L861-969 |
| `zip_sequences.py` | L984-1286 |
| `UpdateSequences.ps1` | L54-96 |
| `zip_sequences.py` | L584-657 |
| `zip_sequences.py` | L861-976 |
| `zip_sequences.py` | L557-650 |

</details>

</tool-use>

---
The quick-skip optimization is likely the cause. It skips state computation if the directory mtime is <= the archive mtime, which can miss deletions when the directory mtime doesn't update (for example, when the removed files sat in a subfolder, since deleting them doesn't touch the top-level sequence directory's mtime). Checking the logic and making it more reliable:
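
For context, the current quick-skip boils down to a single mtime comparison, condensed here into a hypothetical helper (`archive_looks_current`); the real check is inlined in `run_zip`, as the diff below shows:

```python
from pathlib import Path

def archive_looks_current(seq_dir: Path, zip_path: Path) -> bool:
    """The existing quick-skip test, condensed: treat the sequence as unchanged
    if the directory is not newer than its archive."""
    return seq_dir.stat().st_mtime_ns <= zip_path.stat().st_mtime_ns

# Deleting files inside a nested subfolder typically does not bump seq_dir's own
# mtime, so this test can report "unchanged" even though the contents differ.
```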
---

<tool-use data-tool-type="read" data-tool-name="read_file">

<details><summary>Tool use: **read_file** • Read file: zip_sequences.py</summary>

</details>

</tool-use>

---
The quick-skip optimization can miss file deletions when the directory mtime doesn't update. Adding a pre-check: if any files from the stored state are missing, we'll recompute the state instead of skipping.

---

<tool-use data-tool-type="write" data-tool-name="search_replace">

<details><summary>Tool use: **search_replace** • Edit file: zip_sequences.py</summary>

**Chunk 1**

Lines added: 29, lines removed: 15

```diff
@@ -1030,19 +1030,33 @@
         if zip_path.exists() and stored_state is not None:
             # Quick check: if directory mtime is older than archive, likely unchanged
-            try:
-                dir_mtime = seq_dir.stat().st_mtime_ns
-                archive_mtime = zip_path.stat().st_mtime_ns
-                # If directory wasn't modified since archive was created, skip state computation
-                if dir_mtime <= archive_mtime:
-                    quick_skipped += 1
-                    if quick_skipped <= 5:
-                        log("scan", f"Skipping {rel} (unchanged since archive)")
-                    # Still need to check for old .zip cleanup (we have .7z, so .zip is obsolete)
-                    if old_zip_path and old_zip_path.exists():
-                        old_zip_path.unlink(missing_ok=True)
-                        old_state_path = state_path_for(old_zip_path)
-                        if old_state_path.exists():
-                            old_state_path.unlink(missing_ok=True)
-                    continue
+            # But first verify that all files in stored state still exist (catches deletions)
+            try:
+                # Quick check: verify stored files still exist (catches file deletions)
+                stored_files_exist = True
+                for entry in stored_state.get("files", []):
+                    # Skip Thumbs.db entries (they're filtered out anyway)
+                    if Path(entry.get("path", "")).name.lower() == "thumbs.db":
+                        continue
+                    file_path = seq_dir / entry["path"]
+                    if not file_path.exists():
+                        stored_files_exist = False
+                        break
+
+                # Only do mtime check if all stored files still exist
+                if stored_files_exist:
+                    dir_mtime = seq_dir.stat().st_mtime_ns
+                    archive_mtime = zip_path.stat().st_mtime_ns
+                    # If directory wasn't modified since archive was created, skip state computation
+                    if dir_mtime <= archive_mtime:
+                        quick_skipped += 1
+                        if quick_skipped <= 5:
+                            log("scan", f"Skipping {rel} (unchanged since archive)")
+                        # Still need to check for old .zip cleanup (we have .7z, so .zip is obsolete)
+                        if old_zip_path and old_zip_path.exists():
+                            old_zip_path.unlink(missing_ok=True)
+                            old_state_path = state_path_for(old_zip_path)
+                            if old_state_path.exists():
+                                old_state_path.unlink(missing_ok=True)
+                        continue
             except OSError:
                 # If stat fails, fall through to full state computation
```

</details>

</tool-use>

---
<tool-use data-tool-type="generic" data-tool-name="read_lints">

<details>
<summary>Tool use: **read_lints** • Read lints for 1 file</summary>

Lint paths:

- `zip_sequences.py`

Lint results

**No lint errors found**

</details>

</tool-use>

---
Fixed the issue. The quick-skip optimization was skipping state computation whenever the directory mtime was <= the archive mtime, which can miss file deletions if the directory mtime doesn't update.

**The fix:** Before the quick-skip, the code now verifies that every file in the stored state still exists. If any are missing (like the JPGs you removed), it bypasses the quick-skip and recomputes the full state, which detects the change and updates the manifest.

Now when you run `zipseq`:

1. It checks whether the files in the stored state still exist
2. If the JPGs are missing, it recomputes the state
3. It detects the difference and updates the manifest
4. It recompresses the archive with only the remaining PNGs

The manifest should now update correctly when files are removed, even if the directory mtime doesn't change.
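
Condensed into a standalone helper (the `stored_files_exist` function name is hypothetical; the actual change inlines this loop in `run_zip`), the new pre-check amounts to:

```python
from pathlib import Path

def stored_files_exist(seq_dir: Path, stored_state: dict) -> bool:
    """Return False if any file recorded in the stored state has been deleted."""
    for entry in stored_state.get("files", []):
        # Thumbs.db entries are ignored, matching the scanner's own filtering
        if Path(entry.get("path", "")).name.lower() == "thumbs.db":
            continue
        if not (seq_dir / entry["path"]).exists():
            return False
    return True

# Only when this returns True does the mtime-based quick-skip apply; otherwise the
# full state is recomputed, so removed JPGs are picked up even if the directory
# mtime never changed.
```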
---