82 Commits

Author SHA1 Message Date
Dylan Knutson
fa94d90474 unique index on ib_id inkbunny posts 2025-09-08 02:28:24 +00:00
Dylan Knutson
7f7728366b Sep 2025 posted_at fixes 2025-09-07 18:27:22 +00:00
Dylan Knutson
1905575d19 DID searching 2025-09-07 17:47:53 +00:00
Dylan Knutson
3021bc4a97 remove e621 user post fav rows migration 2025-08-20 23:47:03 +00:00
Dylan Knutson
deff73d73a Update task task-94 2025-08-20 22:14:04 +00:00
Dylan Knutson
9e0e8a8d3c Create task task-94 2025-08-20 22:13:51 +00:00
Dylan Knutson
8bd6c4b2ae migrate e621 favs to own table 2025-08-20 22:10:57 +00:00
Dylan Knutson
6381067235 rename fa fav enum to invalid 2025-08-20 18:37:24 +00:00
Dylan Knutson
9b13bec296 media not found fixes, rename unused enum 2025-08-20 15:00:54 +00:00
Dylan Knutson
d1f791d598 Update task task-93 2025-08-20 06:41:11 +00:00
Dylan Knutson
7243635238 Create task task-93 2025-08-20 06:40:19 +00:00
Dylan Knutson
a76b10634e more migration of fa user favs to own table 2025-08-20 04:58:06 +00:00
Dylan Knutson
7bca1452e4 Update task task-92 2025-08-20 03:23:43 +00:00
Dylan Knutson
26d82fca77 Create task task-92 2025-08-20 03:23:24 +00:00
Dylan Knutson
d2789f47dc separate migration for user_post_favs_fa table indexes 2025-08-19 15:53:44 +00:00
Dylan Knutson
4830a4ce54 separate table for fa post favs 2025-08-19 01:22:56 +00:00
Dylan Knutson
7f521b30e9 inkbunny missing posts task fixes 2025-08-18 23:58:06 +00:00
Dylan Knutson
a68e5b0112 bsky fixes, ib missing post enqueuer 2025-08-18 16:28:53 +00:00
Dylan Knutson
8376dfc662 ignore txt files 2025-08-18 05:59:58 +00:00
Dylan Knutson
cb3b52bf41 handle 400 response for users from bsky 2025-08-18 05:59:26 +00:00
Dylan Knutson
8e98a5ee4b remove unused indexes 2025-08-18 01:43:46 +00:00
Dylan Knutson
a8f258d5ef fix frozen string bug, add bsky username prefix to searched names 2025-08-17 19:08:44 +00:00
Dylan Knutson
15ea73a350 fix bsky user profile link sanitizer 2025-08-17 18:51:55 +00:00
Dylan Knutson
6bf64cf8c6 Create task task-91 2025-08-17 08:17:16 +00:00
Dylan Knutson
c5dc181187 Update task task-90 2025-08-17 08:11:40 +00:00
Dylan Knutson
b6e3912ccb Create task task-90 2025-08-17 08:09:58 +00:00
Dylan Knutson
73f6f77596 Add comprehensive Bluesky tests to posts_helper_spec
- Add extensive test coverage for Bluesky user profile URL matching
- Test handle-based and DID-based profile URLs with various formats
- Add edge cases and error condition tests for malformed URLs
- Test user avatar icon path and model path generation
- Verify fallback behavior for users without display names
- Test priority logic for handle vs DID lookup
- Add tests for special characters and very long handles
- All 82 tests now pass successfully
2025-08-17 00:10:31 +00:00
Dylan Knutson
cdcd574d02 monitor bsky user button 2025-08-16 21:27:55 +00:00
Dylan Knutson
8d6953c758 update task 88 2025-08-16 19:30:31 +00:00
Dylan Knutson
558c4f940e tsc fix 2025-08-16 19:26:03 +00:00
Dylan Knutson
c1b63275e8 show number of files associated with post if > 1 2025-08-16 19:23:35 +00:00
Dylan Knutson
fd97d145cb spec for showing users faving post 2025-08-16 19:09:48 +00:00
Dylan Knutson
87fda1a475 Update task task-89 2025-08-16 19:06:02 +00:00
Dylan Knutson
cbb08ba8c0 Create task task-89 2025-08-16 19:05:51 +00:00
Dylan Knutson
130d77419a Update task task-88 2025-08-16 19:04:03 +00:00
Dylan Knutson
598993abaf Create task task-88 2025-08-16 19:03:57 +00:00
Dylan Knutson
d06347a771 extract bsky posts/users from e621 2025-08-16 18:59:39 +00:00
Dylan Knutson
0fd4d13673 unique check for url name 2025-08-16 06:53:39 +00:00
Dylan Knutson
df02fd3077 bsky 422, 500, 504 2025-08-16 06:35:39 +00:00
Dylan Knutson
5b12e28fb7 bsky 422 2025-08-16 05:23:09 +00:00
Dylan Knutson
64a65d1490 workaround for bsky avatar incorrect url 2025-08-16 05:17:26 +00:00
Dylan Knutson
a1fab9e645 bmp support, buggy fa user, url decode usernames 2025-08-16 04:44:04 +00:00
Dylan Knutson
1e46e42352 INFO log levels for staging/development 2025-08-15 22:15:19 +00:00
Dylan Knutson
43876ef7c4 use getProfile for scan user job 2025-08-15 22:15:09 +00:00
Dylan Knutson
4d456ee73d Improve logging and add rake task for monitoring user follows
- Enhanced logging format in scan_user_follows_job and monitor tasks using format_tags
- Added new rake task 'bluesky:watch_follows' to monitor users that a given user follows
- Improved log formatting consistency across Bluesky monitoring components
2025-08-15 21:55:18 +00:00
Dylan Knutson
b6e2e5e502 Update telegram bot task, user view, and type definitions
- Modified telegram bot task implementation
- Updated domain users index view
- Updated telegram-bot-ruby type shims
2025-08-15 05:59:11 +00:00
Dylan Knutson
2acf31c70a use html as telegram bot parse mode 2025-08-14 21:31:53 +00:00
Dylan Knutson
3c83ed3ba7 fixes for bsky monitoring 2025-08-14 21:13:37 +00:00
Dylan Knutson
1058a53d18 monitor hashtag impl 2025-08-14 20:48:19 +00:00
Dylan Knutson
5646e388be base structure for monitoring hashtags 2025-08-14 20:35:15 +00:00
Dylan Knutson
c1310c6dcc build vips from source 2025-08-14 19:54:34 +00:00
Dylan Knutson
62f14d10d4 visual search fixes 2025-08-14 19:29:28 +00:00
Dylan Knutson
2a8d631b29 split common visual search logic out 2025-08-14 19:11:13 +00:00
Dylan Knutson
90d2cce076 Update task task-87 2025-08-14 18:58:14 +00:00
Dylan Knutson
db6f2ce92e Create task task-87 2025-08-14 18:56:58 +00:00
Dylan Knutson
ca937eb2bc process mp4 file thumbnailing 2025-08-14 18:16:14 +00:00
Dylan Knutson
981bea5016 Update task task-86 2025-08-14 18:14:28 +00:00
Dylan Knutson
66e97ba5c7 Update task task-86 2025-08-14 18:13:07 +00:00
Dylan Knutson
7d07a18a80 Create task task-86 2025-08-14 18:12:57 +00:00
Dylan Knutson
e9ac97be29 split out common bsky post creation logic into Bluesky::ProcessPostHelper 2025-08-14 17:55:17 +00:00
Dylan Knutson
cfffe50541 add monitor scanned at to bsky monitor 2025-08-14 17:16:21 +00:00
Dylan Knutson
1d248c1f23 user follows/followed by scans for bluesky 2025-08-14 17:03:50 +00:00
Dylan Knutson
e1933104b3 Create task task-85 2025-08-14 16:04:46 +00:00
Dylan Knutson
419a1503f2 Update task task-84 2025-08-13 08:23:46 +00:00
Dylan Knutson
9a113fe2be Update task task-84 2025-08-13 08:23:44 +00:00
Dylan Knutson
c78dd401c7 Create task task-84 2025-08-13 08:23:39 +00:00
Dylan Knutson
b33a267a83 by descending post id 2025-08-13 08:20:32 +00:00
Dylan Knutson
6bb0b255fb touch user model after scanning posts 2025-08-12 23:15:19 +00:00
Dylan Knutson
1357eb9095 improve monitor 2025-08-12 23:05:41 +00:00
Dylan Knutson
dea2071662 better user list 2025-08-12 22:52:52 +00:00
Dylan Knutson
6df6f63060 bsky user registered at scanning 2025-08-12 22:27:22 +00:00
Dylan Knutson
420a44a27d bsky page scanning auditing 2025-08-12 21:56:05 +00:00
Dylan Knutson
2de7f85a99 bsky descriptions with newlines 2025-08-12 21:33:40 +00:00
Dylan Knutson
171ddd430b misc fixes for bsky 2025-08-12 21:22:51 +00:00
Dylan Knutson
ad0675a9aa Add Bluesky post helper with facet rendering and external link support
- Add BlueskyPostHelper for rendering Bluesky post facets (mentions, links, hashtags)
- Implement facet parsing and rendering with proper styling
- Add external link partial for non-Bluesky URLs
- Update DisplayedFile and PostFiles components to handle Bluesky posts
- Add comprehensive test coverage for helper methods
- Update scan user job to handle Bluesky-specific data
2025-08-12 20:43:08 +00:00
Dylan Knutson
d08c896d97 show reply / quotes for bsky posts 2025-08-12 18:31:17 +00:00
Dylan Knutson
127dd9be51 Add Bluesky file display components and utilities
- Add SkySection component for displaying Bluesky-specific file information
- Add byteCountToHumanSize utility for formatting file sizes
- Update PostFiles, FileCarousel, FileDetails, and DisplayedFile components
- Enhance posts helper with file display logic
- Update post model and view templates
- Remove deprecated file details sky section partial
2025-08-12 18:14:13 +00:00
Dylan Knutson
390f0939b0 video post downloading 2025-08-12 00:24:32 +00:00
Dylan Knutson
40c6d44100 Convert ScanPostsJob tests to use SpecUtil.enqueued_job_args and add rescan tests
- Convert existing job mocking to use SpecUtil.enqueued_job_args helper
- Remove allow(Domain::StaticFileJob).to receive(:perform_later) mocking
- Add comprehensive test context for rescanning users with pending files
- Create domain_post_file_bluesky_post_file factory for test objects
- Add tests verifying enqueue_pending_files_job behavior during rescans
- Ensure only pending files get jobs enqueued, not already processed files
- Use force_scan: true to bypass scan frequency limits in tests
2025-08-10 20:49:26 +00:00
Dylan Knutson
ded26741a8 test: update FA job specs and add test fixtures
- Update FurAffinity favs job spec with improved test coverage
- Update user gallery job spec with enhanced testing
- Add new test fixtures for FA favorites and gallery parsing
- Add minimal test fixtures for better test performance
- Update .cursorrules with latest development guidelines
2025-08-10 19:57:04 +00:00
Dylan Knutson
eba4b58666 feat: implement Bluesky scan posts job and enhance user scanning
- Add new ScanPostsJob for scanning Bluesky posts
- Enhance ScanUserJob with improved error handling and logging
- Update BlueskyPost model with new fields and validation
- Add auxiliary tables for Bluesky posts
- Improve job base classes with better color logging
- Update specs with proper HTTP mocking patterns
- Add factory for BlueskyPost testing
2025-08-10 18:41:01 +00:00
Dylan Knutson
5c71fc6b15 Add avatar downloading to Bluesky scan user job
- Modified process_user_avatar method to enqueue Domain::UserAvatarJob for avatar downloads
- Made Domain::UserAvatarJob concrete (removed abstract!) with generic HTTP client
- Added smart avatar management: handles new avatars, URL changes, and pending re-enqueues
- Added comprehensive test coverage for all avatar scenarios
- Updated HTTP mocking in specs to use HttpClientMockHelpers pattern
- Fixed caused_by_entry handling for chained HTTP requests
- Updated .cursorrules with proper HTTP mocking documentation including caused_by_entry: :any

The job now automatically downloads user avatars when scanning Bluesky users.
2025-08-09 01:23:16 +00:00
235 changed files with 22856 additions and 2337 deletions

View File

@@ -6,3 +6,27 @@ log
public
.bundle
gems
# Generated/build artifacts
node_modules
user_scripts/dist
app/assets/builds
vendor/javascript
# Sorbet generated files
sorbet/tapioca
sorbet/rbi/gems
sorbet/rbi/annotations
sorbet/rbi/dsl
# Configuration files with secrets
config/credentials.yml.enc
config/master.key
# Lock files
yarn.lock
Gemfile.lock
# Documentation
TODO.md
*.notes.md

View File

@@ -10,6 +10,9 @@
- For instance, if you modify `app/models/domain/post.rb`, run `bin/rspec spec/models/domain/post_spec.rb`. If you modify `app/views/domain/users/index.html.erb`, run `bin/rspec spec/controllers/domain/users_controller_spec.rb`.
- At the end of a long series of changes, run `just test`.
- If specs are failing, then fix the failures, and rerun with `bin/rspec <path_to_spec_file>`.
- If you need to add logging to a Job to debug it, set `quiet: false` on the spec you are debugging.
- Fish shell is used for development, not bash.
- When running scratch commands, use `bin/rails runner`, not `bin/rails console`.
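For example, a quick one-off check can run through the full app environment with `bin/rails runner` (the script path and queries below are a hypothetical sketch, not files in this repo):
```ruby
# Invoke with: bin/rails runner scratch/check_counts.rb
# scratch/check_counts.rb (illustrative only; Domain::Post and Domain::User are existing models)
puts "posts: #{Domain::Post.count}"
puts "users: #{Domain::User.count}"
```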
# Typescript Development
@@ -17,6 +20,75 @@
- Styling is done with Tailwind CSS and FontAwesome.
- Put new typescript files in `app/javascript/bundles/Main/components/`
# HTTP Mocking in Job Specs
When writing specs for jobs that make HTTP requests, use `HttpClientMockHelpers.init_with()` instead of manually creating doubles:
```ruby
# CORRECT: Use HttpClientMockHelpers.init_with
let(:client_mock_config) do
[
{
uri: "https://example.com/api/first-endpoint",
status_code: 200,
content_type: "application/json",
contents: first_response_body,
},
{
uri: "https://example.com/api/second-endpoint",
status_code: 200,
content_type: "application/json",
contents: second_response_body,
caused_by_entry: :any, # Use this for chained requests
},
]
end
before { @log_entries = HttpClientMockHelpers.init_with(client_mock_config) }
# WRONG: Don't create doubles manually
expect(http_client_mock).to receive(:get).and_return(
double(status_code: 200, body: response_body, log_entry: double),
)
# WRONG: Don't use the old init_http_client_mock method
@log_entries =
HttpClientMockHelpers.init_http_client_mock(
http_client_mock,
client_mock_config,
)
```
This pattern:
- Uses the preferred `init_with` helper method
- Automatically uses the global `http_client_mock` from `spec_helper.rb`
- Creates real HttpLogEntry objects that can be serialized by ActiveJob
- Follows the established codebase pattern
- Avoids "Unsupported argument type: RSpec::Mocks::Double" errors
- Use `caused_by_entry: :any` for HTTP requests that are chained (where one request's log entry becomes the `caused_by_entry` for the next request)
- No need to manually set up `http_client_mock` - it's handled globally in `spec_helper.rb`
# Job Enqueuing Verification in Specs
Use `SpecUtil.enqueued_job_args()` instead of mocking `perform_later`:
```ruby
# CORRECT: Test actual job enqueuing
enqueued_jobs = SpecUtil.enqueued_job_args(SomeJob)
expect(enqueued_jobs).to contain_exactly(hash_including(user: user))
expect(enqueued_jobs).to be_empty # For no jobs
# WRONG: Don't mock perform_later (breaks with .set chaining)
expect(SomeJob).to receive(:perform_later)
```
Benefits: More robust, tests actual behavior, no cleanup needed (tests run in transactions).
# Testing Jobs
When writing specs for jobs (e.g. `Domain::Site::SomethingJob`), do not invoke `job.perform(...)` directly; always use `perform_now(...)` (defined in `spec/helpers/perform_job_helpers.rb`).
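A minimal sketch of the convention, assuming `perform_now` accepts the job class followed by its keyword arguments (the job class, arguments, and exact helper signature are illustrative, not confirmed by the codebase):
```ruby
# CORRECT: drive the job through the perform_now spec helper
it "scans the user" do
  perform_now(Domain::Site::SomethingJob, user: user) # assumed signature
end

# WRONG: don't instantiate the job and call perform directly
it "scans the user" do
  Domain::Site::SomethingJob.new.perform(user: user)
end
```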
# === BACKLOG.MD GUIDELINES START ===
# Instructions for the usage of Backlog.md CLI Tool

View File

@@ -33,7 +33,6 @@ RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
libreoffice \
libsqlite3-dev \
libssl-dev \
libvips42 \
libyaml-dev \
patch \
pdftohtml \
@@ -43,6 +42,57 @@ RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
watchman \
zlib1g-dev
# Install vips dependencies
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
--mount=type=cache,target=/var/lib/apt,sharing=locked \
apt-get update && \
apt-get install --no-install-recommends --no-install-suggests -qqy \
automake \
gtk-doc-tools \
gobject-introspection \
libgirepository1.0-dev \
libglib2.0-dev \
libexpat1-dev \
libjpeg-dev \
libpng-dev \
libtiff5-dev \
libwebp-dev \
libheif-dev \
libexif-dev \
liblcms2-dev \
libxml2-dev \
libfftw3-dev \
liborc-0.4-dev \
libcgif-dev \
libjxl-dev \
libopenjp2-7-dev \
meson \
ninja-build
# Install imagemagick from source
RUN cd /tmp && \
wget -qO- https://imagemagick.org/archive/releases/ImageMagick-7.1.2-1.tar.xz | tar -xJ && \
cd ImageMagick-7.1.2-1 && \
./configure && \
make -j$(nproc) && \
make install && \
ldconfig && \
cd / && \
rm -rf /tmp/ImageMagick-7.1.2-1*
# Install vips from source
RUN cd /tmp && \
wget -qO- https://github.com/libvips/libvips/releases/download/v8.17.1/vips-8.17.1.tar.xz | tar -xJ && \
cd vips-8.17.1 && \
meson setup build --prefix=/usr/local -Dcgif=enabled && \
cd build && \
ninja && \
ninja install && \
ldconfig && \
cd / && \
rm -rf /tmp/vips-8.17.1*
# Install postgres 15 client
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
--mount=type=cache,target=/var/lib/apt,sharing=locked \

.gitignore vendored
View File

@@ -15,7 +15,7 @@ migrated_files.txt
package-lock.json
*.notes.md
*.txt
# Ignore bundler config.
/.bundle

View File

@@ -32,7 +32,6 @@ RUN \
--mount=type=cache,target=/var/lib/apt,sharing=locked \
apt-get update && \
apt-get install --no-install-recommends --no-install-suggests -y \
libvips42 \
ca-certificates \
curl \
gnupg \
@@ -44,6 +43,56 @@ RUN \
pdftohtml \
libreoffice
# Install vips dependencies
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
--mount=type=cache,target=/var/lib/apt,sharing=locked \
apt-get update && \
apt-get install --no-install-recommends --no-install-suggests -qqy \
automake \
gtk-doc-tools \
gobject-introspection \
libgirepository1.0-dev \
libglib2.0-dev \
libexpat1-dev \
libjpeg-dev \
libpng-dev \
libtiff5-dev \
libwebp-dev \
libheif-dev \
libexif-dev \
liblcms2-dev \
libxml2-dev \
libfftw3-dev \
liborc-0.4-dev \
libcgif-dev \
libjxl-dev \
libopenjp2-7-dev \
meson \
ninja-build
# Install imagemagick from source
RUN cd /tmp && \
wget -qO- https://imagemagick.org/archive/releases/ImageMagick-7.1.2-1.tar.xz | tar -xJ && \
cd ImageMagick-7.1.2-1 && \
./configure && \
make -j$(nproc) && \
make install && \
ldconfig && \
cd / && \
rm -rf /tmp/ImageMagick-7.1.2-1*
# Install vips from source
RUN cd /tmp && \
wget -qO- https://github.com/libvips/libvips/releases/download/v8.17.1/vips-8.17.1.tar.xz | tar -xJ && \
cd vips-8.17.1 && \
meson setup build --prefix=/usr/local -Dcgif=enabled && \
cd build && \
ninja && \
ninja install && \
ldconfig && \
cd / && \
rm -rf /tmp/vips-8.17.1*
WORKDIR /usr/src/app
COPY Gemfile Gemfile.lock ./
COPY gems/has_aux_table ./gems/has_aux_table

View File

@@ -1,4 +1,4 @@
rails: RAILS_ENV=development HTTP_PORT=3001 rdbg --command --nonstop --open -- thrust ./bin/rails server
rails: RAILS_ENV=development HTTP_PORT=3001 thrust ./bin/rails server
wp-client: RAILS_ENV=development HMR=true ./bin/webpacker-dev-server
wp-server: RAILS_ENV=development HMR=true SERVER_BUNDLE_ONLY=yes ./bin/webpacker --watch
css: tailwindcss -c ./config/tailwind.config.js -i ./app/assets/stylesheets/application.tailwind.css -o ./app/assets/builds/tailwind.css --watch

View File

@@ -18,7 +18,7 @@ class Domain::PostsController < DomainController
visual_results
]
before_action :set_post!, only: %i[show]
before_action :set_user!, only: %i[user_favorite_posts user_created_posts]
before_action :set_user!, only: %i[user_created_posts]
before_action :set_post_group!, only: %i[posts_in_group]
class PostsIndexViewConfig < T::ImmutableStruct
@@ -65,29 +65,6 @@ class Domain::PostsController < DomainController
authorize @post
end
sig(:final) { void }
def user_favorite_posts
@posts_index_view_config =
PostsIndexViewConfig.new(
show_domain_filters: false,
show_creator_links: true,
index_type_header: "user_favorites",
)
@user = T.must(@user)
authorize @user
@posts = @user.faved_posts
@post_favs =
Domain::UserPostFav.where(user: @user, post: @posts).index_by(&:post_id)
# Apply pagination through posts_relation
@posts = posts_relation(@posts, skip_ordering: true)
authorize @posts
render :index
end
sig(:final) { void }
def user_created_posts
@posts_index_view_config =
@@ -147,21 +124,35 @@ class Domain::PostsController < DomainController
authorize Domain::Post
# Process the uploaded image or URL
image_result = process_image_input
return unless image_result
image_path, content_type = image_result
file_result = process_image_input
return unless file_result
file_path, content_type = file_result
# Create thumbnail for the view if possible
@uploaded_image_data_uri = create_thumbnail(image_path, content_type)
@uploaded_hash_value = generate_fingerprint(image_path)
@uploaded_detail_hash_value = generate_detail_fingerprint(image_path)
tmp_dir = Dir.mktmpdir("visual-search")
thumbs_and_fingerprints =
helpers.generate_fingerprints(file_path, content_type, tmp_dir)
first_thumb_and_fingerprint = thumbs_and_fingerprints&.first
if thumbs_and_fingerprints.nil? || first_thumb_and_fingerprint.nil?
flash.now[:error] = "Error generating fingerprints"
render :visual_search
return
end
logger.info("generated #{thumbs_and_fingerprints.length} thumbs")
@uploaded_image_data_uri =
helpers.create_image_thumbnail_data_uri(
first_thumb_and_fingerprint.thumb_path,
"image/jpeg",
)
@uploaded_detail_hash_value = first_thumb_and_fingerprint.detail_fingerprint
before = Time.now
similar_fingerprints =
helpers.find_similar_fingerprints(
fingerprint_value: @uploaded_hash_value,
fingerprint_detail_value: @uploaded_detail_hash_value,
thumbs_and_fingerprints.map(&:to_fingerprint_and_detail),
).take(10)
@time_taken = Time.now - before
@matches = similar_fingerprints
@@ -173,10 +164,7 @@ class Domain::PostsController < DomainController
@matches = @good_matches if @good_matches.any?
ensure
# Clean up any temporary files
if @temp_file
@temp_file.unlink
@temp_file = nil
end
FileUtils.rm_rf(tmp_dir) if tmp_dir
end
private
@@ -240,27 +228,6 @@ class Domain::PostsController < DomainController
nil
end
# Create a thumbnail from the image and return the data URI
sig do
params(image_path: String, content_type: String).returns(T.nilable(String))
end
def create_thumbnail(image_path, content_type)
helpers.create_image_thumbnail_data_uri(image_path, content_type)
end
# Generate a fingerprint from the image path
sig { params(image_path: String).returns(String) }
def generate_fingerprint(image_path)
# Use the new from_file_path method to create a fingerprint
Domain::PostFile::BitFingerprint.from_file_path(image_path)
end
# Generate a detail fingerprint from the image path
sig { params(image_path: String).returns(String) }
def generate_detail_fingerprint(image_path)
Domain::PostFile::BitFingerprint.detail_from_file_path(image_path)
end
sig { override.returns(DomainController::DomainParamConfig) }
def self.param_config
DomainController::DomainParamConfig.new(
@@ -281,10 +248,7 @@ class Domain::PostsController < DomainController
def posts_relation(starting_relation, skip_ordering: false)
relation = starting_relation
relation = T.unsafe(policy_scope(relation)).page(params[:page]).per(50)
relation =
relation.order(
relation.klass.post_order_attribute => :desc,
) unless skip_ordering
relation = relation.order("posted_at DESC NULLS LAST") unless skip_ordering
relation
end
end

View File

@@ -0,0 +1,29 @@
# typed: true
# frozen_string_literal: true
class Domain::UserPostFavsController < DomainController
before_action :set_user!, only: %i[favorites]
def self.param_config
DomainParamConfig.new(
post_id_param: :domain_post_id,
user_id_param: :domain_user_id,
post_group_id_param: :domain_post_group_id,
)
end
sig { void }
def favorites
@posts_index_view_config =
Domain::PostsController::PostsIndexViewConfig.new(
show_domain_filters: false,
show_creator_links: true,
index_type_header: "user_favorites",
)
user = T.cast(@user, Domain::User)
@user_post_favs =
user.user_post_favs.includes(:post).page(params[:page]).per(50)
authorize @user_post_favs
render :favorites
end
end

View File

@@ -3,7 +3,8 @@ class Domain::UsersController < DomainController
extend T::Sig
extend T::Helpers
before_action :set_user!, only: %i[show followed_by following]
before_action :set_user!,
only: %i[show followed_by following monitor_bluesky_user]
before_action :set_post!, only: %i[users_faving_post]
skip_before_action :authenticate_user!,
only: %i[
@@ -75,6 +76,24 @@ class Domain::UsersController < DomainController
authorize Domain::User
name = params[:name]&.downcase
name = ReduxApplicationRecord.sanitize_sql_like(name)
if name.starts_with?("did:plc:") || name.starts_with?("did:pkh:")
@user_search_names =
Domain::UserSearchName
.select(
"domain_user_search_names.*, domain_users.*, domain_users_bluesky_aux.did",
)
.select(
"levenshtein(domain_users_bluesky_aux.did, '#{name}') as distance",
)
.where(
user: Domain::User::BlueskyUser.where("did LIKE ?", "#{name}%"),
)
.joins(:user)
.limit(10)
return
end
@user_search_names =
Domain::UserSearchName
.select("domain_user_search_names.*, domain_users.*")
@@ -167,6 +186,23 @@ class Domain::UsersController < DomainController
}
end
sig { void }
def monitor_bluesky_user
user = T.cast(@user, Domain::User::BlueskyUser)
authorize user
monitor = Domain::Bluesky::MonitoredObject.build_for_user(user)
if monitor.save
Domain::Bluesky::Job::ScanUserJob.perform_later(user:)
Domain::Bluesky::Job::ScanPostsJob.perform_later(user:)
flash[:notice] = "User is now being monitored"
else
flash[
:alert
] = "Error monitoring user: #{monitor.errors.full_messages.join(", ")}"
end
redirect_to domain_user_path(user)
end
private
sig { override.returns(DomainController::DomainParamConfig) }

View File

@@ -0,0 +1,210 @@
# typed: strict
# frozen_string_literal: true
module Domain::BlueskyPostHelper
extend T::Sig
include ActionView::Helpers::UrlHelper
include HelpersInterface
include Domain::PostsHelper
class FacetPart < T::Struct
const :type, Symbol
const :value, String
end
sig do
params(text: String, facets: T.nilable(T::Array[T.untyped])).returns(
T.nilable(String),
)
end
def render_bsky_post_facets(text, facets = nil)
return text if facets.blank?
facets =
begin
facets.map { |facet| Bluesky::Text::Facet.from_hash(facet) }
rescue => e
Rails.logger.error("error parsing Bluesky facets: #{e.message}")
return text
end
result_parts = T.let([], T::Array[FacetPart])
last_end = 0
# Sort facets by start position to handle them in order
sorted_facets = facets.sort_by(&:byteStart)
sorted_facets.each do |facet|
if facet.byteStart < 0 || facet.byteEnd <= facet.byteStart ||
facet.byteEnd > text.bytesize
next
end
# Skip overlapping facets
next if facet.byteStart < last_end
# Add text before this facet
if facet.byteStart > last_end
before_text = text.byteslice(last_end, facet.byteStart - last_end)
if before_text
result_parts << FacetPart.new(type: :text, value: before_text)
end
end
# Extract the facet text using byteslice for accurate character extraction
facet_text =
text.byteslice(facet.byteStart, facet.byteEnd - facet.byteStart)
next unless facet_text # Skip if byteslice returns nil
# Process the facet
rendered_facet = render_facet(facet, facet_text)
result_parts << FacetPart.new(type: :facet, value: rendered_facet)
last_end = facet.byteEnd
end
# Add remaining text after the last facet
if last_end < text.bytesize
remaining_text = text.byteslice(last_end, text.bytesize - last_end)
if remaining_text
result_parts << FacetPart.new(type: :text, value: remaining_text)
end
end
result_parts
.map do |part|
case part.type
when :text
part.value.gsub("\n", "<br />")
when :facet
part.value
end
end
.join
.html_safe
end
private
sig do
params(facet: Bluesky::Text::Facet, facet_text: String).returns(String)
end
def render_facet(facet, facet_text)
return facet_text unless facet.features.any?
# Process the first feature (Bluesky facets typically have one feature per facet)
feature = facet.features.first
return facet_text unless feature.is_a?(Bluesky::Text::FacetFeature)
case feature
when Bluesky::Text::FacetFeatureMention
render_mention_facet(feature, facet_text)
when Bluesky::Text::FacetFeatureURI
render_link_facet(feature, facet_text)
when Bluesky::Text::FacetFeatureTag
render_tag_facet(feature, facet_text)
else
# Unknown facet type, return original text
facet_text
end
end
sig do
params(
feature: Bluesky::Text::FacetFeatureMention,
facet_text: String,
).returns(String)
end
def render_mention_facet(feature, facet_text)
did = feature.did
return facet_text unless did.present?
# Try to find the user in the database
user = Domain::User::BlueskyUser.find_by(did: did)
if user
# Render the inline user partial
render(
partial: "domain/has_description_html/inline_link_domain_user",
locals: {
user: user,
link_text: facet_text,
visual_style: "description-section-link-light",
},
)
else
# Render external link to Bluesky profile
render(
partial: "domain/has_description_html/external_link",
locals: {
link_text: facet_text,
url: "https://bsky.app/profile/#{did}",
},
)
end
end
sig do
params(feature: Bluesky::Text::FacetFeatureURI, facet_text: String).returns(
String,
)
end
def render_link_facet(feature, facet_text)
uri = feature.uri
return facet_text unless uri.present?
source = link_for_source(uri)
if source.present? && (model = source.model)
case model
when Domain::Post
return(
render(
partial: "domain/has_description_html/inline_link_domain_post",
locals: {
post: model,
link_text: facet_text,
visual_style: "description-section-link-light",
},
)
)
when Domain::User
return(
render(
partial: "domain/has_description_html/inline_link_domain_user",
locals: {
user: model,
link_text: facet_text,
visual_style: "description-section-link-light",
},
)
)
end
end
render(
partial: "domain/has_description_html/external_link",
locals: {
link_text: facet_text,
url: uri,
},
)
end
sig do
params(feature: Bluesky::Text::FacetFeatureTag, facet_text: String).returns(
String,
)
end
def render_tag_facet(feature, facet_text)
tag = feature.tag
return facet_text unless tag.present?
render(
partial: "domain/has_description_html/external_link",
locals: {
link_text: facet_text,
url: "https://bsky.app/hashtag/#{tag}",
},
)
end
end

View File

@@ -57,11 +57,16 @@ module Domain::DescriptionsHelper
end
WEAK_URL_MATCHER_REGEX =
%r{(http(s)?:\/\/.)?(www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)}
%r{(http(s)?:\/\/)?(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)}
sig { params(str: String).returns(T.nilable(String)) }
def extract_weak_url(str)
str.match(WEAK_URL_MATCHER_REGEX)&.[](0)
end
ALLOWED_INFERRED_URL_DOMAINS =
T.let(
%w[furaffinity.net inkbunny.net e621.net]
%w[furaffinity.net inkbunny.net e621.net bsky.app]
.flat_map { |domain| [domain, "www.#{domain}"] }
.freeze,
T::Array[String],
@@ -72,6 +77,16 @@ module Domain::DescriptionsHelper
html = model.description_html_for_view
return nil if html.blank?
is_bsky_description = model.is_a?(Domain::User::BlueskyUser)
visual_style =
(
if model.is_a?(Domain::User::BlueskyUser)
"description-section-link-light"
else
"description-section-link"
end
)
case model
when Domain::Post::E621Post
dtext_result = DText.parse(html)
@@ -95,17 +110,23 @@ module Domain::DescriptionsHelper
next unless node.text?
next unless node.ancestors("a").empty?
next unless (node_text = T.cast(node.text, T.nilable(String)))
next unless (match = node_text.match(WEAK_URL_MATCHER_REGEX))
next unless (url_text = match[0])
next unless (url_text = extract_weak_url(node_text))
next if url_text.blank?
unless (
uri =
try_parse_uri(model.description_html_base_domain, url_text)
)
next
end
unless ALLOWED_PLAIN_TEXT_URL_DOMAINS.any? { |domain|
url_matches_domain?(domain, uri.host)
}
if is_bsky_description
unless ALLOWED_EXTERNAL_LINK_DOMAINS.any? { |domain|
url_matches_domain?(domain, uri.host)
}
next
end
elsif ALLOWED_PLAIN_TEXT_URL_DOMAINS.none? do |domain|
url_matches_domain?(domain, uri.host)
end
next
end
@@ -157,20 +178,12 @@ module Domain::DescriptionsHelper
when Domain::Post
[
"domain/has_description_html/inline_link_domain_post",
{
post: found_model,
link_text: node.text,
visual_style: "description-section-link",
},
{ post: found_model, link_text: node.text, visual_style: },
]
when Domain::User
[
"domain/has_description_html/inline_link_domain_user",
{
user: found_model,
link_text: node.text,
visual_style: "description-section-link",
},
{ user: found_model, link_text: node.text, visual_style: },
]
else
raise "Unknown model type: #{found_link.model.class}"
@@ -191,14 +204,24 @@ module Domain::DescriptionsHelper
end
replacements[node] = Nokogiri::HTML5.fragment(
render(
partial: "domain/has_description_html/inline_link_external",
locals: {
url: url.to_s,
title:,
icon_path: icon_path_for_domain(url.host),
},
),
if is_bsky_description
render(
partial: "domain/has_description_html/external_link",
locals: {
link_text: node.text,
url: url.to_s,
},
)
else
render(
partial: "domain/has_description_html/inline_link_external",
locals: {
url: url.to_s,
title:,
icon_path: icon_path_for_domain(url.host),
},
)
end,
)
next { node_whitelist: [node] }
end
@@ -259,6 +282,13 @@ module Domain::DescriptionsHelper
"rounded-md px-1 transition-all",
"inline-flex items-center align-bottom",
].join(" ")
when "description-section-link-light"
[
"text-sky-600 border-slate-300",
"border border-transparent hover:border-slate-500 hover:text-sky-800 hover:bg-slate-200",
"rounded-md px-1 transition-all",
"inline-flex items-center align-bottom",
].join(" ")
else
"blue-link"
end
@@ -320,13 +350,15 @@ module Domain::DescriptionsHelper
link_text: String,
visual_style: String,
domain_icon: T::Boolean,
link_params: T::Hash[Symbol, T.untyped],
).returns(T::Hash[Symbol, T.untyped])
end
def props_for_post_hover_preview(
post,
link_text,
visual_style,
domain_icon: true
domain_icon: true,
link_params: {}
)
cache_key = [
post,
@@ -341,7 +373,11 @@ module Domain::DescriptionsHelper
linkText: link_text,
postId: post.to_param,
postTitle: post.title,
postPath: Rails.application.routes.url_helpers.domain_post_path(post),
postPath:
Rails.application.routes.url_helpers.domain_post_path(
post,
link_params,
),
postThumbnailPath: thumbnail_for_post_path(post),
postThumbnailAlt: "View on #{domain_name_for_model(post)}",
postDomainIcon: domain_icon ? domain_model_icon_path(post) : nil,

View File

@@ -11,6 +11,7 @@ module Domain::DomainsHelper
e621.net
furaffinity.net
inkbunny.net
bsky.app
].freeze
# If a link is detected in an anchor tag and is one of these domains,

View File

@@ -196,81 +196,78 @@ module Domain::PostsHelper
sig do
params(
ok_files: T::Array[Domain::PostFile],
post_files: T::Array[Domain::PostFile],
initial_file_index: T.nilable(Integer),
).returns(T::Hash[Symbol, T.untyped])
end
def props_for_post_files(ok_files, initial_file_index: nil)
def props_for_post_files(post_files:, initial_file_index: nil)
files_data =
ok_files.map.with_index do |file, index|
post_files.map.with_index do |post_file, index|
thumbnail_path = nil
content_html = nil
file_details_html = nil
log_entry = post_file.log_entry
if file.log_entry&.status_code == 200
log_entry = file.log_entry
# Generate thumbnail path
begin
if log_entry && (response_sha256 = log_entry.response_sha256)
thumbnail_path =
if log_entry && (log_entry.status_code == 200)
if (response_sha256 = log_entry.response_sha256)
thumbnail_path = {
type: "url",
value:
blob_path(
HexUtil.bin2hex(response_sha256),
format: "jpg",
thumb: "small",
)
end
rescue StandardError
# thumbnail_path remains nil
),
}
end
# Generate content HTML
begin
content_html =
ApplicationController.renderer.render(
partial: "log_entries/content_container",
locals: {
log_entry: log_entry,
},
assigns: {
current_user: nil,
},
)
rescue StandardError
# content_html remains nil
end
# Generate file details HTML
begin
file_details_html =
ApplicationController.renderer.render(
partial: "log_entries/file_details_sky_section",
locals: {
log_entry: log_entry,
},
assigns: {
current_user: nil,
},
)
rescue StandardError
# file_details_html remains nil
end
content_html =
ApplicationController.renderer.render(
partial: "log_entries/content_container",
locals: {
log_entry: log_entry,
},
assigns: {
current_user: nil,
},
)
elsif post_file.state_pending?
thumbnail_path = {
type: "icon",
value: "fa-solid fa-file-arrow-down",
}
end
{
id: file.id,
id: post_file.id,
fileState: post_file.state,
thumbnailPath: thumbnail_path,
hasContent: file.log_entry&.status_code == 200,
hasContent: post_file.log_entry&.status_code == 200,
index: index,
contentHtml: content_html,
fileDetailsHtml: file_details_html,
fileDetails:
(
if log_entry
{
contentType: log_entry.content_type,
fileSize: log_entry.response_size,
responseTimeMs: log_entry.response_time_ms,
responseStatusCode: log_entry.status_code,
postFileState: post_file.state,
logEntryId: log_entry.id,
logEntryPath: log_entry_path(log_entry),
}
else
nil
end
),
}
end
# Validate initial_file_index
validated_initial_index = 0
validated_initial_index = nil
if initial_file_index && initial_file_index >= 0 &&
initial_file_index < ok_files.count
initial_file_index < post_files.count
validated_initial_index = initial_file_index
end
@@ -369,6 +366,7 @@ module Domain::PostsHelper
IB_HOSTS = %w[*.inkbunny.net inkbunny.net]
IB_CDN_HOSTS = %w[*.ib.metapix.net ib.metapix.net]
E621_HOSTS = %w[www.e621.net e621.net]
BLUESKY_HOSTS = %w[bsky.app]
URL_SUFFIX_QUERY = T.let(<<-SQL.strip.chomp.freeze, String)
lower('url_str') = lower(?)
@@ -494,6 +492,44 @@ module Domain::PostsHelper
end
end,
),
# Bluesky posts
SourceMatcher.new(
hosts: BLUESKY_HOSTS,
patterns: [%r{/profile/([^/]+)/post/([^/]+)/?$}],
find_proc: ->(helper, match, _) do
handle_or_did = match[1]
post_rkey = match[2]
if handle_or_did.start_with?("did:")
did = handle_or_did
else
user = Domain::User::BlueskyUser.find_by(handle: handle_or_did)
did = user&.did
end
next unless did
at_uri = "at://#{did}/app.bsky.feed.post/#{post_rkey}"
post = Domain::Post::BlueskyPost.find_by(at_uri:)
SourceResult.new(model: post, title: post.title_for_view) if post
end,
),
# Bluesky users
SourceMatcher.new(
hosts: BLUESKY_HOSTS,
patterns: [%r{/profile/([^/]+)\/?$}],
find_proc: ->(helper, match, _) do
handle_or_did = match[1]
user =
if handle_or_did.start_with?("did:")
Domain::User::BlueskyUser.find_by(did: handle_or_did)
else
Domain::User::BlueskyUser.find_by(handle: handle_or_did)
end
next unless user
SourceResult.new(
model: user,
title: user.name_for_view || handle_or_did,
)
end,
),
],
T::Array[SourceMatcher],
)
@@ -503,7 +539,7 @@ module Domain::PostsHelper
return nil if source.blank?
# normalize the source to a lowercase string with a protocol
source.downcase!
source = source.downcase
source = "https://" + source unless source.include?("://")
begin
uri = URI.parse(source)

View File

@@ -31,7 +31,7 @@ module Domain::UsersHelper
end
def domain_user_registered_at_ts_for_view(user)
case user
when Domain::User::FaUser, Domain::User::E621User
when Domain::User::FaUser, Domain::User::E621User, Domain::User::BlueskyUser
user.registered_at
else
nil
@@ -203,6 +203,27 @@ module Domain::UsersHelper
due_for_scan ? "fa-hourglass-half" : "fa-check"
end
if user.is_a?(Domain::User::BlueskyUser) && can_view_timestamps
rows << StatRow.new(
name: "Page scanned",
value: user.profile_scan,
link_to:
user.last_scan_log_entry && log_entry_path(user.last_scan_log_entry),
fa_icon_class: icon_for.call(user.profile_scan.due?),
hover_title: user.profile_scan.interval.inspect,
)
rows << StatRow.new(
name: "Posts scanned",
value: user.posts_scan,
link_to:
user.last_posts_scan_log_entry &&
log_entry_path(user.last_posts_scan_log_entry),
fa_icon_class: icon_for.call(user.posts_scan.due?),
hover_title: user.posts_scan.interval.inspect,
)
end
if user.is_a?(Domain::User::FaUser) && can_view_timestamps
if can_view_log_entries && hle = user.guess_last_user_page_log_entry
rows << StatRow.new(

View File

@@ -69,49 +69,122 @@ module Domain
end
class SimilarFingerprintResult < T::Struct
include T::Struct::ActsAsComparable
const :fingerprint, Domain::PostFile::BitFingerprint
const :similarity_percentage, Float
end
class FingerprintAndDetail < T::Struct
include T::Struct::ActsAsComparable
const :fingerprint, String
const :detail_fingerprint, String
end
# Find similar images based on the fingerprint
sig do
params(
fingerprint_value: String,
fingerprint_detail_value: String,
fingerprints: T::Array[FingerprintAndDetail],
limit: Integer,
oversearch: Integer,
includes: T.untyped,
).returns(T::Array[SimilarFingerprintResult])
end
def find_similar_fingerprints(
fingerprint_value:,
fingerprint_detail_value:,
fingerprints,
limit: 32,
oversearch: 2,
includes: {}
)
ActiveRecord::Base.connection.execute("SET ivfflat.probes = 20")
Domain::PostFile::BitFingerprint
.order(
Arel.sql "(fingerprint_value <~> '#{ActiveRecord::Base.connection.quote_string(fingerprint_value)}')"
)
.limit(limit * oversearch)
.includes(includes)
.to_a
.uniq(&:post_file_id)
.map do |other_fingerprint|
SimilarFingerprintResult.new(
fingerprint: other_fingerprint,
similarity_percentage:
calculate_similarity_percentage(
fingerprint_detail_value,
T.must(other_fingerprint.fingerprint_detail_value),
),
)
results =
fingerprints.flat_map do |f|
Domain::PostFile::BitFingerprint
.order(
Arel.sql "(fingerprint_value <~> '#{ActiveRecord::Base.connection.quote_string(f.fingerprint)}')"
)
.limit(limit * oversearch)
.includes(includes)
.to_a
.uniq(&:post_file_id)
.map do |other_fingerprint|
SimilarFingerprintResult.new(
fingerprint: other_fingerprint,
similarity_percentage:
calculate_similarity_percentage(
f.detail_fingerprint,
T.must(other_fingerprint.fingerprint_detail_value),
),
)
end
.sort { |a, b| b.similarity_percentage <=> a.similarity_percentage }
.take(limit)
end
.sort { |a, b| b.similarity_percentage <=> a.similarity_percentage }
.take(limit)
results
.group_by { |s| T.must(s.fingerprint.post_file_id) }
.map do |post_file_id, similar_fingerprints|
T.must(similar_fingerprints.max_by(&:similarity_percentage))
end
.sort_by(&:similarity_percentage)
.reverse
end
class GenerateFingerprintsResult < T::Struct
extend T::Sig
include T::Struct::ActsAsComparable
const :thumb_path, String
const :fingerprint, String
const :detail_fingerprint, String
sig { returns(FingerprintAndDetail) }
def to_fingerprint_and_detail
FingerprintAndDetail.new(
fingerprint: fingerprint,
detail_fingerprint: detail_fingerprint,
)
end
end
# Generate a fingerprint from the image path
sig do
params(image_path: String, content_type: String, tmp_dir: String).returns(
T.nilable(T::Array[GenerateFingerprintsResult]),
)
end
def generate_fingerprints(image_path, content_type, tmp_dir)
# Use the new from_file_path method to create a fingerprint
media = LoadedMedia.from_file(content_type, image_path)
return nil unless media
thumbnail_options =
LoadedMedia::ThumbnailOptions.new(
width: 128,
height: 128,
quality: 95,
size: :force,
interlace: false,
for_frames: [0.0, 0.1, 0.5, 0.9, 1.0],
)
frame_nums =
thumbnail_options
.for_frames
.map do |frame_fraction|
(frame_fraction * (media.num_frames - 1)).to_i
end
.uniq
.sort
frame_nums.map do |frame_num|
tmp_file = File.join(tmp_dir, "frame-#{frame_num}.jpg")
media.write_frame_thumbnail(frame_num, tmp_file, thumbnail_options)
GenerateFingerprintsResult.new(
thumb_path: tmp_file,
fingerprint:
Domain::PostFile::BitFingerprint.from_file_path(tmp_file),
detail_fingerprint:
Domain::PostFile::BitFingerprint.detail_from_file_path(tmp_file),
)
end
end
end
end

View File

@@ -8,6 +8,7 @@ module DomainSourceHelper
"furaffinity" => "Domain::Post::FaPost",
"e621" => "Domain::Post::E621Post",
"inkbunny" => "Domain::Post::InkbunnyPost",
"bluesky" => "Domain::Post::BlueskyPost",
}
end

View File

@@ -93,11 +93,18 @@ module GoodJobHelper
sig { params(job: GoodJob::Job).returns(T::Array[JobArg]) }
def arguments_for_job(job)
deserialized =
T.cast(
ActiveJob::Arguments.deserialize(job.serialized_params).to_h,
T::Hash[String, T.untyped],
begin
deserialized =
T.cast(
ActiveJob::Arguments.deserialize(job.serialized_params).to_h,
T::Hash[String, T.untyped],
)
rescue ActiveJob::DeserializationError => e
Rails.logger.error(
"error deserializing job arguments: #{e.class.name} - #{e.message}",
)
return [JobArg.new(key: :error, value: e.message, inferred: true)]
end
args_hash =
T.cast(deserialized["arguments"].first, T::Hash[Symbol, T.untyped])
args =

View File

@@ -16,17 +16,28 @@ export const DisplayedFile: React.FC<DisplayedFileProps> = ({ file }) => {
) : (
<section className="flex grow justify-center text-slate-500">
<div>
<i className="fa-solid fa-file-arrow-down"></i>
No file content available
<i className="fa-solid fa-file-arrow-down mr-2"></i>
{fileStateContent(file.fileState)}
</div>
</section>
)}
</div>
{/* File details */}
{file.fileDetailsHtml && <FileDetails html={file.fileDetailsHtml} />}
{file.fileDetails && <FileDetails {...file.fileDetails} />}
</>
);
};
function fileStateContent(fileState: FileData['fileState']) {
switch (fileState) {
case 'pending':
return 'File pending download';
case 'terminal_error':
return 'File download failed';
}
return 'No file content available';
}
export default DisplayedFile;

View File

@@ -40,7 +40,7 @@ export const FileCarousel: React.FC<FileCarouselProps> = ({
isSelected ? 'border-blue-500' : 'border-gray-300',
];
if (file.thumbnailPath) {
if (file.thumbnailPath?.type === 'url') {
buttonClasses.push('overflow-hidden');
} else {
buttonClasses.push(
@@ -51,6 +51,19 @@ export const FileCarousel: React.FC<FileCarouselProps> = ({
);
}
const thumbnail =
file.thumbnailPath?.type === 'url' ? (
<img
src={file.thumbnailPath.value}
className="h-full w-full object-cover"
alt={`File ${file.index + 1}`}
/>
) : file.thumbnailPath?.type === 'icon' ? (
<i className={`${file.thumbnailPath.value} text-slate-500`}></i>
) : (
<i className="fa-solid fa-file text-gray-400"></i>
);
return (
<button
key={file.id}
@@ -60,15 +73,7 @@ export const FileCarousel: React.FC<FileCarouselProps> = ({
data-index={file.index}
title={`File ${file.index + 1} of ${totalFiles}`}
>
{file.thumbnailPath ? (
<img
src={file.thumbnailPath}
className="h-full w-full object-cover"
alt={`File ${file.index + 1}`}
/>
) : (
<i className="fa-solid fa-file text-gray-400"></i>
)}
{thumbnail}
</button>
);
})}

View File

@@ -1,15 +1,112 @@
import * as React from 'react';
import { PostFileState } from './PostFiles';
import { byteCountToHumanSize } from '../utils/byteCountToHumanSize';
import SkySection from './SkySection';
interface FileDetailsProps {
html: string;
export interface FileDetailsProps {
contentType: string;
fileSize: number;
responseTimeMs: number;
responseStatusCode: number;
postFileState: PostFileState;
logEntryId: number;
logEntryPath: string;
}
export const FileDetails: React.FC<FileDetailsProps> = ({ html }) => {
export const FileDetails: React.FC<FileDetailsProps> = ({
contentType,
fileSize,
responseTimeMs,
responseStatusCode,
postFileState,
logEntryId,
logEntryPath,
}) => {
return (
<SkySection
title="File Details"
contentClassName="grid grid-cols-3 sm:grid-cols-6 text-sm"
>
<TitleStat
label="Type"
value={contentType}
iconClass="fa-solid fa-file"
/>
<TitleStat
label="Size"
value={byteCountToHumanSize(fileSize)}
iconClass="fa-solid fa-weight-hanging"
/>
<TitleStat
label="Time"
value={responseTimeMs == -1 ? undefined : `${responseTimeMs}ms`}
iconClass="fa-solid fa-clock"
/>
<TitleStat
label="Status"
value={responseStatusCode}
textClass={
responseStatusCode == 200 ? 'text-green-600' : 'text-red-600'
}
iconClass="fa-solid fa-signal"
/>
<TitleStat
label="State"
value={postFileState}
textClass={postFileState == 'ok' ? 'text-green-600' : 'text-red-600'}
iconClass="fa-solid fa-circle-check"
/>
<TitleStat label="Log Entry" iconClass="fa-solid fa-file-pen">
<a
href={logEntryPath}
target="_blank"
rel="noopener noreferrer"
className="font-medium text-blue-600 hover:text-blue-800"
>
#{logEntryId}
</a>
</TitleStat>
</SkySection>
);
};
const TitleStat: React.FC<{
label: string;
value?: string | number;
iconClass: string;
textClass?: string;
children?: React.ReactNode;
}> = ({ label, value, iconClass, textClass = 'text-slate-600', children }) => {
function valueElement(value: string | number | undefined) {
const defaultTextClass = 'font-normal';
if (value === undefined) {
return <span className="text-slate-500">&mdash;</span>;
} else if (typeof value === 'number') {
return (
<span className={`${textClass} ${defaultTextClass}`}>
{value.toLocaleString()}
</span>
);
} else {
return (
<span className={`${textClass} ${defaultTextClass}`}>{value}</span>
);
}
}
const gridInnerBorderClasses =
'border-r border-b border-slate-300 last:border-r-0 sm:last:border-r-0 [&:nth-child(3)]:border-r-0 sm:[&:nth-child(3)]:border-r [&:nth-last-child(-n+3)]:border-b-0 sm:[&:nth-last-child(-n+6)]:border-b-0';
return (
<div
className="file-details-section"
dangerouslySetInnerHTML={{ __html: html }}
/>
className={`flex flex-col justify-center px-2 py-1 ${gridInnerBorderClasses}`}
>
<div className="flex items-center gap-2 font-light text-slate-600">
<i className={iconClass}></i>
<span>{label}</span>
</div>
{children || valueElement(value)}
</div>
);
};

View File

@@ -10,6 +10,7 @@ const COMMON_LIST_ELEM_CLASSES = `
interface PropTypes {
value: string;
subvalue?: string;
subtext?: string;
thumb?: string;
isLast: boolean;
@@ -21,6 +22,7 @@ interface PropTypes {
export default function ListItem({
value,
subvalue,
thumb,
isLast,
selected,
@@ -54,7 +56,7 @@ export default function ListItem({
{style === 'error' && (
<Icon type="exclamation-circle" className={iconClassName.join(' ')} />
)}
<div className={textClassName.join(' ')}>
<div className={`${textClassName.join(' ')}`}>
<div className="inline-block w-8">
{thumb && (
<img
@@ -64,7 +66,28 @@ export default function ListItem({
/>
)}
</div>
<div className="inline-block flex-grow pl-1">{value}</div>
<div className="flex flex-grow flex-col pl-1">
<span
className={['text-lg font-light', subvalue && 'leading-tight']
.filter(Boolean)
.join(' ')}
>
{value}
</span>
{subvalue && (
<span
className={[
'text-sm font-normal group-hover:text-slate-200',
!selected && 'text-slate-500',
selected && 'text-slate-200',
]
.filter(Boolean)
.join(' ')}
>
{subvalue}
</span>
)}
</div>
{subtext && (
<div
className={[
@@ -80,7 +103,7 @@ export default function ListItem({
</div>
)}
{domainIcon && (
<img src={domainIcon} alt="domain icon" className="inline w-6" />
<img src={domainIcon} alt="domain icon" className="ml-1 inline w-6" />
)}
</div>
</a>

View File

@@ -2,14 +2,24 @@ import * as React from 'react';
import { useState, useEffect, useCallback } from 'react';
import { FileCarousel } from './FileCarousel';
import { DisplayedFile } from './DisplayedFile';
import { FileDetailsProps } from './FileDetails';
export type PostFileState =
| 'pending'
| 'ok'
| 'file_error'
| 'retryable_error'
| 'terminal_error'
| 'removed';
export interface FileData {
id: number;
thumbnailPath?: string;
fileState: PostFileState;
thumbnailPath?: { type: 'icon' | 'url'; value: string };
hasContent: boolean;
index: number;
contentHtml?: string;
fileDetailsHtml?: string;
fileDetails?: FileDetailsProps;
}
interface PostFilesProps {
@@ -19,8 +29,14 @@ interface PostFilesProps {
export const PostFiles: React.FC<PostFilesProps> = ({
files,
initialSelectedIndex = 0,
initialSelectedIndex,
}) => {
if (initialSelectedIndex == null) {
initialSelectedIndex = files.findIndex((file) => file.fileState === 'ok');
if (initialSelectedIndex === -1) {
initialSelectedIndex = 0;
}
}
const [selectedIndex, setSelectedIndex] = useState(initialSelectedIndex);
// Update URL parameter when selected file changes
@@ -82,6 +98,14 @@ export const PostFiles: React.FC<PostFilesProps> = ({
return (
<section id="file-display-section">
{files.length == 0 && (
<div className="flex grow justify-center text-slate-500">
<div className="flex items-center gap-2">
<i className="fa-solid fa-file-circle-exclamation"></i>
No files
</div>
</div>
)}
{files.length > 1 && (
<FileCarousel
files={files}

View File

@@ -0,0 +1,30 @@
import * as React from 'react';
export interface SkySectionProps {
title: string;
children?: React.ReactNode;
contentClassName?: string;
}
export const SkySection: React.FC<SkySectionProps> = ({
title,
children,
contentClassName,
}) => {
return (
<div className="sky-section w-full">
<SkySectionHeader title={title} />
<div className={contentClassName}>{children}</div>
</div>
);
};
export default SkySection;
export const SkySectionHeader: React.FC<SkySectionProps> = ({ title }) => {
return (
<div className="section-header flex items-center justify-between border-b py-2">
<span>{title}</span>
</div>
);
};

View File

@@ -33,6 +33,7 @@ interface PropTypes {
interface User {
id: number;
name: string;
name_secondary?: string;
thumb?: string;
show_path: string;
num_posts?: number;
@@ -198,13 +199,24 @@ export default function UserSearchBar({ isServerRendered }: PropTypes) {
) : null}
{visibility.items
? state.userList.map(
({ name, thumb, show_path, num_posts, domain_icon }, idx) => (
(
{
name,
name_secondary,
thumb,
show_path,
num_posts,
domain_icon,
},
idx,
) => (
<ListItem
key={'name-' + name}
isLast={idx == state.userList.length - 1}
selected={idx == state.selectedIdx}
style="item"
value={name}
subvalue={name_secondary}
thumb={thumb}
href={show_path}
subtext={num_posts ? `${num_posts.toString()} posts` : ''}

View File

@@ -15,6 +15,8 @@ interface ImageState {
file: File;
previewUrl: string;
originalFileSize: number | null;
thumbnailFile?: File; // For video thumbnails
isVideo?: boolean;
}
const ACCEPTED_IMAGE_TYPES = [
@@ -23,10 +25,11 @@ const ACCEPTED_IMAGE_TYPES = [
'image/jpg',
'image/gif',
'image/webp',
'video/mp4',
];
const ACCEPTED_EXTENSIONS =
'image/png,image/jpeg,image/jpg,image/gif,image/webp';
'image/png,image/jpeg,image/jpg,image/gif,image/webp,video/mp4';
// Feedback Message Component
interface FeedbackMessageProps {
@@ -126,23 +129,98 @@ async function resizeImageIfNeeded(file: File): Promise<File> {
});
}
// Helper function to generate thumbnail from video file
async function generateVideoThumbnail(file: File): Promise<File> {
return new Promise((resolve) => {
const video = document.createElement('video');
const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d')!;
video.onloadedmetadata = () => {
// Set canvas dimensions to match video
canvas.width = video.videoWidth;
canvas.height = video.videoHeight;
// Seek to 1 second into the video (or 10% through if shorter)
const seekTime = Math.min(1, video.duration * 0.1);
video.currentTime = seekTime;
};
video.onseeked = () => {
// Draw the current frame to canvas
ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
// Convert to blob as JPEG
canvas.toBlob(
(blob) => {
if (blob) {
// Create a new file with thumbnail data but keep original video file name base
const thumbnailName =
file.name.replace(/\.[^/.]+$/, '') + '_thumbnail.jpg';
const thumbnailFile = new File([blob], thumbnailName, {
type: 'image/jpeg',
lastModified: Date.now(),
});
resolve(thumbnailFile);
} else {
resolve(file); // Fallback to original file
}
},
'image/jpeg',
0.8, // Quality setting (80%)
);
};
video.onerror = () => {
resolve(file); // Fallback to original file if video processing fails
};
// Load the video file
video.src = URL.createObjectURL(file);
video.load();
});
}
function ImagePreview({ imageState, onRemove }: ImagePreviewProps) {
const isVideo = imageState.isVideo;
return (
<div className="flex items-center gap-4">
<img
src={imageState.previewUrl}
alt="Selected image thumbnail"
className="max-h-32 max-w-32 flex-shrink-0 rounded-md object-cover shadow-sm"
/>
<div className="relative max-h-32 max-w-32 flex-shrink-0">
<img
src={imageState.previewUrl}
alt={isVideo ? 'Video thumbnail' : 'Selected image thumbnail'}
className="max-h-32 max-w-32 rounded-md object-cover shadow-sm"
/>
{isVideo && (
<div className="absolute inset-0 flex items-center justify-center">
<div className="rounded-full bg-black bg-opacity-70 p-2">
<svg
className="h-4 w-4 text-white"
fill="currentColor"
viewBox="0 0 20 20"
>
<path d="M6.3 2.841A1.5 1.5 0 004 4.11V15.89a1.5 1.5 0 002.3 1.269l9.344-5.89a1.5 1.5 0 000-2.538L6.3 2.84z" />
</svg>
</div>
</div>
)}
</div>
<div className="flex min-w-0 flex-1 flex-col justify-center gap-1">
<h3 className="text-sm font-medium text-green-700">Selected Image</h3>
<h3 className="text-sm font-medium text-green-700">
{isVideo ? 'Selected Video' : 'Selected Image'}
</h3>
<p
className="max-w-32 truncate text-xs text-green-600"
title={imageState.file.name}
>
{imageState.file.name}
</p>
{imageState.originalFileSize ? (
{isVideo ? (
<p className="text-xs text-slate-500">
{formatFileSize(imageState.file.size)} (thumbnail generated)
</p>
) : imageState.originalFileSize ? (
<div className="text-xs text-slate-500">
<div>Original: {formatFileSize(imageState.originalFileSize)}</div>
<div>Resized: {formatFileSize(imageState.file.size)}</div>
@@ -159,9 +237,9 @@ function ImagePreview({ imageState, onRemove }: ImagePreviewProps) {
onRemove();
}}
className="mt-1 self-start rounded bg-slate-600 px-2 py-1 text-xs font-medium text-slate-100 transition-colors hover:bg-red-600 focus:bg-red-600 focus:outline-none"
title="Clear image"
title={isVideo ? 'Clear video' : 'Clear image'}
>
Remove Image
{isVideo ? 'Remove Video' : 'Remove Image'}
</button>
</div>
</div>
@@ -191,10 +269,10 @@ function EmptyDropZone({ isMobile }: EmptyDropZoneProps) {
/>
</svg>
<p className="hidden font-medium text-slate-600 sm:block">
Drag and drop an image here
Drag and drop an image or video here
</p>
<p className="block font-medium text-slate-600 sm:hidden">
tap here to paste an image from the clipboard
tap here to paste an image or video from the clipboard
</p>
<p className="text-xs text-slate-500">or use one of the options below</p>
</div>
@@ -284,13 +362,15 @@ function FileUploadSection({
}: FileUploadSectionProps) {
return (
<div className="flex flex-1 flex-col items-center justify-center">
<h3 className="text-lg font-medium text-slate-500">Upload an image</h3>
<h3 className="text-lg font-medium text-slate-500">
Upload an image or video
</h3>
<div className="flex flex-col gap-1">
<label
htmlFor="image-file-input"
className="text-sm font-medium text-slate-700"
>
Choose an image file
Choose an image or video file
</label>
<input
ref={fileInputRef}
@@ -305,7 +385,7 @@ function FileUploadSection({
}}
/>
<p className="text-xs text-slate-500">
Supported formats: JPG, PNG, GIF, WebP
Supported formats: JPG, PNG, GIF, WebP, MP4
</p>
</div>
</div>
@@ -321,13 +401,15 @@ interface UrlUploadSectionProps {
function UrlUploadSection({ imageUrl, onUrlChange }: UrlUploadSectionProps) {
return (
<div className="flex flex-1 flex-col items-center justify-center">
<h3 className="text-lg font-medium text-slate-500">Provide image URL</h3>
<h3 className="text-lg font-medium text-slate-500">
Provide image or video URL
</h3>
<div className="flex flex-col gap-1">
<label
htmlFor="image-url-input"
className="text-sm font-medium text-slate-700"
>
Image URL
Image or Video URL
</label>
<input
id="image-url-input"
@@ -336,10 +418,10 @@ function UrlUploadSection({ imageUrl, onUrlChange }: UrlUploadSectionProps) {
value={imageUrl}
onChange={(e) => onUrlChange(e.target.value)}
className="w-full rounded-md border-slate-300 text-sm"
placeholder="https://example.com/image.jpg"
placeholder="https://example.com/image.jpg or https://example.com/video.mp4"
/>
<p className="text-xs text-slate-500">
Enter the direct URL to an image
Enter the direct URL to an image or video
</p>
</div>
</div>
@@ -460,7 +542,7 @@ export default function VisualSearchForm({
async (file: File) => {
if (!ACCEPTED_IMAGE_TYPES.includes(file.type)) {
showFeedback(
'Please select a valid image file (JPG, PNG, GIF, WebP)',
'Please select a valid image or video file (JPG, PNG, GIF, WebP, MP4)',
'error',
);
return;
@@ -471,34 +553,57 @@ export default function VisualSearchForm({
URL.revokeObjectURL(imageState.previewUrl);
}
// Show processing message for large files
// Show processing message for large files or videos
const originalSize = file.size;
const isLargeFile = originalSize > 2 * 1024 * 1024; // 2MB
const isVideo = file.type.startsWith('video/');
if (isLargeFile) {
showFeedback('Processing large image...', 'warning');
if (isLargeFile || isVideo) {
showFeedback(
isVideo
? 'Generating video thumbnail...'
: 'Processing large image...',
'warning',
);
}
try {
// Resize image if needed
const processedFile = await resizeImageIfNeeded(file);
let processedFile: File;
let thumbnailFile: File | undefined;
let previewUrl: string;
if (isVideo) {
// For video files, generate thumbnail for preview but keep original for upload
thumbnailFile = await generateVideoThumbnail(file);
processedFile = file; // Keep original video for upload
previewUrl = URL.createObjectURL(thumbnailFile);
// Set the original video file in the file input for form submission
const dataTransfer = new DataTransfer();
dataTransfer.items.add(file);
fileInputRef.current!.files = dataTransfer.files;
} else {
// For image files, process as before
processedFile = await resizeImageIfNeeded(file);
previewUrl = URL.createObjectURL(processedFile);
const dataTransfer = new DataTransfer();
dataTransfer.items.add(processedFile);
fileInputRef.current!.files = dataTransfer.files;
}
clearFeedback();
const dataTransfer = new DataTransfer();
dataTransfer.items.add(processedFile);
fileInputRef.current!.files = dataTransfer.files;
// Track original size if image was resized
const wasResized = processedFile.size < originalSize;
// Create preview URL for the thumbnail
const previewUrl = URL.createObjectURL(processedFile);
const wasResized = !isVideo && processedFile.size < originalSize;
// Set all image state at once
setImageState({
file: processedFile,
previewUrl,
originalFileSize: wasResized ? originalSize : null,
thumbnailFile,
isVideo,
});
// Visual feedback
@@ -506,7 +611,9 @@ export default function VisualSearchForm({
setTimeout(() => setIsDragOver(false), 1000);
} catch (error) {
showFeedback(
'Error processing image. Please try another file.',
isVideo
? 'Error processing video. Please try another file.'
: 'Error processing image. Please try another file.',
'error',
);
}
@@ -553,10 +660,13 @@ export default function VisualSearchForm({
const files = e.dataTransfer.files;
if (files.length > 0) {
const file = files[0];
if (file.type.match('image.*')) {
if (file.type.match('image.*') || file.type.match('video.*')) {
await handleImageFile(file);
} else {
showFeedback(`Please drop an image file (got ${file.type})`, 'error');
showFeedback(
`Please drop an image or video file (got ${file.type})`,
'error',
);
}
}
},
@@ -585,7 +695,7 @@ export default function VisualSearchForm({
}
showFeedback(
'No image found in clipboard. Copy an image first, then try again.',
'No image or video found in clipboard. Copy an image or video first, then try again.',
'warning',
);
return false;
@@ -621,6 +731,10 @@ export default function VisualSearchForm({
imageFile = item.getAsFile();
break;
}
if (item.type.indexOf('video') !== -1) {
imageFile = item.getAsFile();
break;
}
}
if (imageFile) {
@@ -629,7 +743,7 @@ export default function VisualSearchForm({
dragDropRef.current?.focus();
} else {
showFeedback(
'No image found in clipboard. Copy an image first, then paste here.',
'No image or video found in clipboard. Copy an image or video first, then paste here.',
'warning',
);
}

View File

@@ -0,0 +1,23 @@
/**
* Converts a byte count to a human-readable size string.
*
* @param bytes - The number of bytes to convert
* @param decimals - Number of decimal places to show (default: 1)
* @returns A human-readable size string (e.g., "1.2 KB", "3.4 MB")
*/
export function byteCountToHumanSize(
bytes: number,
decimals: number = 1,
): string {
if (bytes === 0) return '0 B';
if (bytes < 0) return '0 B';
const k = 1024;
const dm = decimals < 0 ? 0 : decimals;
const sizes = ['B', 'KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'];
const i = Math.floor(Math.log(bytes) / Math.log(k));
const size = parseFloat((bytes / Math.pow(k, i)).toFixed(dm));
return `${size} ${sizes[i]}`;
}

View File

@@ -1,9 +1,11 @@
# typed: strict
class Domain::Bluesky::Job::Base < Scraper::JobBase
abstract!
discard_on ActiveJob::DeserializationError
include HasBulkEnqueueJobs
queue_as :bluesky
discard_on ActiveJob::DeserializationError
sig { override.returns(Symbol) }
def self.http_factory_method
:get_generic_http_client
@@ -11,20 +13,65 @@ class Domain::Bluesky::Job::Base < Scraper::JobBase
protected
sig { params(user: Domain::User::BlueskyUser).void }
def enqueue_scan_posts_job_if_due(user)
if user.posts_scan.due? || force_scan?
logger.info(
format_tags(
"enqueue posts scan",
make_tags(posts_scan: user.posts_scan.ago_in_words),
),
)
defer_job(Domain::Bluesky::Job::ScanPostsJob, { user: })
else
logger.info(
format_tags(
"skipping enqueue of posts scan",
make_tags(scanned_at: user.posts_scan.ago_in_words),
),
)
end
end
sig { params(user: Domain::User::BlueskyUser).void }
def enqueue_scan_user_job_if_due(user)
if user.profile_scan.due? || force_scan?
logger.info(
format_tags(
"enqueue user scan",
make_tags(profile_scan: user.profile_scan.ago_in_words),
),
)
defer_job(Domain::Bluesky::Job::ScanUserJob, { user: })
else
logger.info(
format_tags(
"skipping enqueue of user scan",
make_tags(scanned_at: user.profile_scan.ago_in_words),
),
)
end
end
sig { returns(T.nilable(Domain::User::BlueskyUser)) }
def user_from_args
if (user = arguments[0][:user]).is_a?(Domain::User::BlueskyUser)
user
elsif (did = arguments[0][:did]).present?
Domain::User::BlueskyUser.find_or_initialize_by(did: did)
Domain::User::BlueskyUser.find_or_create_by(did:) do |user|
resolver = DIDKit::Resolver.new
if (resolved = resolver.resolve_did(did))
user.handle = resolved.get_validated_handle
end
end
elsif (handle = arguments[0][:handle]).present?
resolver = DIDKit::Resolver.new
resolved =
resolver.resolve_handle(handle) ||
fatal_error("failed to resolve handle: #{handle}")
Domain::User::BlueskyUser.find_or_initialize_by(
did: resolved.did,
) { |user| user.handle = handle }
did = resolver.resolve_handle(handle)&.did
fatal_error("failed to resolve handle: #{handle}") if did.nil?
user = Domain::User::BlueskyUser.find_or_initialize_by(did:)
user.handle = handle
user.save!
user
else
nil
end

View File

@@ -0,0 +1,258 @@
# typed: strict
class Domain::Bluesky::Job::ScanPostsJob < Domain::Bluesky::Job::Base
MEDIA_EMBED_TYPES = %w[app.bsky.embed.images app.bsky.embed.video]
self.default_priority = -10
sig { override.params(args: T::Hash[Symbol, T.untyped]).returns(T.untyped) }
def perform(args)
user = user_from_args!
logger.push_tags(make_arg_tag(user))
logger.info(format_tags("starting posts scan"))
return if buggy_user?(user)
unless user.state_ok?
logger.error(
format_tags("skipping posts scan", make_tags(state: user.state)),
)
return
end
if !user.posts_scan.due? && !force_scan?
logger.info(
format_tags(
"skipping posts scan",
make_tags(scanned_at: user.posts_scan.ago_in_words),
),
)
return
end
scan_user_posts(user)
user.last_posts_scan_log_entry = first_log_entry
user.touch
logger.info(format_tags("completed posts scan"))
ensure
user.save! if user
end
private
sig do
params(
user: Domain::User::BlueskyUser,
record_data: T::Hash[String, T.untyped],
).returns(T::Boolean)
end
def should_record_post?(user, record_data)
# Check for quotes first - skip quotes of other users' posts
quote_uri = extract_quote_uri(record_data)
if quote_uri
# Extract DID from the quoted post URI
quoted_did = quote_uri.split("/")[2]
return false unless quoted_did == user.did
end
# Check for replies - only record if it's a root post or reply to user's own post
return true unless record_data.dig("value", "reply")
# For replies, check if the root post is by the same user
reply_data = record_data.dig("value", "reply")
root_uri = reply_data.dig("root", "uri")
return true unless root_uri # If we can't determine root, allow it
# Extract DID from the root post URI
# AT URI format: at://did:plc:xyz/app.bsky.feed.post/rkey
root_did = root_uri.split("/")[2]
# Only record if the root post is by the same user
root_did == user.did
end
sig { params(record: T::Hash[String, T.untyped]).returns(T.nilable(String)) }
def extract_quote_uri(record)
# Check for quote in embed data
embed = record["embed"]
return nil unless embed
case embed["$type"]
when "app.bsky.embed.record"
# Direct quote - check if it's actually a quote of a post
record_data = embed["record"]
if record_data && record_data["uri"]&.include?("app.bsky.feed.post")
record_data["uri"]
end
when "app.bsky.embed.recordWithMedia"
# Quote with media
record_data = embed.dig("record", "record")
if record_data && record_data["uri"]&.include?("app.bsky.feed.post")
record_data["uri"]
end
else
nil
end
end
sig { params(user: Domain::User::BlueskyUser).void }
def scan_user_posts(user)
# Use AT Protocol API to list user's posts
posts_url =
"https://bsky.social/xrpc/com.atproto.repo.listRecords?repo=#{user.did}&collection=app.bsky.feed.post&limit=100"
cursor = T.let(nil, T.nilable(String))
num_processed_posts = 0
num_posts_with_media = 0
num_filtered_posts = 0
num_created_posts = 0
num_pages = 0
posts_scan = Domain::UserJobEvent::PostsScan.create!(user:)
loop do
url = cursor ? "#{posts_url}&cursor=#{cursor}" : posts_url
response = http_client.get(url)
posts_scan.update!(log_entry: response.log_entry) if num_pages == 0
num_pages += 1
if response.status_code == 400
error = JSON.parse(response.body)["error"]
if error == "InvalidRequest"
logger.error(format_tags("account is disabled / does not exist"))
user.state = "account_disabled"
return
end
end
if response.status_code != 200
fatal_error(
format_tags(
"failed to get user posts",
make_tags(status_code: response.status_code),
),
)
end
begin
data = JSON.parse(response.body)
if data["error"]
logger.error(
format_tags("posts API error", make_tags(error: data["error"])),
)
break
end
records = data["records"] || []
records.each do |record_data|
num_processed_posts += 1
embed_type = record_data.dig("value", "embed", "$type")
unless MEDIA_EMBED_TYPES.include?(embed_type)
logger.info(
format_tags(
"skipping post, non-media embed type",
make_tags(embed_type:),
),
)
next
end
# Only process posts with media
num_posts_with_media += 1
# Skip posts that are replies to other users or quotes
unless should_record_post?(user, record_data)
num_filtered_posts += 1
next
end
if process_historical_post(user, record_data, response.log_entry)
num_created_posts += 1
end
end
cursor = data["cursor"]
break if cursor.nil? || records.empty?
rescue JSON::ParserError => e
logger.error(
format_tags(
"failed to parse posts JSON",
make_tags(error: e.message),
),
)
break
end
end
user.scanned_posts_at = Time.current
posts_scan.update!(
total_posts_seen: num_processed_posts,
new_posts_seen: num_created_posts,
)
logger.info(
format_tags(
"scanned posts",
make_tags(
num_processed_posts:,
num_posts_with_media:,
num_filtered_posts:,
num_created_posts:,
num_pages:,
),
),
)
end
sig do
params(
user: Domain::User::BlueskyUser,
record_data: T::Hash[String, T.untyped],
log_entry: HttpLogEntry,
).returns(T::Boolean)
end
def process_historical_post(user, record_data, log_entry)
at_uri = record_data["uri"]
# Check if we already have this post
existing_post = user.posts.find_by(at_uri:)
if existing_post
enqueue_pending_files_job(existing_post)
return false
end
# Extract reply and quote URIs from the raw post data
reply_to_uri = record_data.dig("value", "reply", "root", "uri")
quote_uri = extract_quote_uri(record_data)
post =
Domain::Post::BlueskyPost.build(
state: "ok",
at_uri: at_uri,
first_seen_entry: log_entry,
text: record_data.dig("value", "text") || "",
posted_at: Time.parse(record_data.dig("value", "createdAt")),
post_raw: record_data.dig("value"),
reply_to_uri: reply_to_uri,
quote_uri: quote_uri,
)
post.creator = user
post.save!
# Process media if present
embed = record_data.dig("value", "embed")
helper = Bluesky::ProcessPostHelper.new(@deferred_job_sink)
helper.process_post_media(post, embed, user.did!) if embed
logger.debug(format_tags("created post", make_tags(at_uri:)))
true
end
sig { params(post: Domain::Post::BlueskyPost).void }
def enqueue_pending_files_job(post)
post.files.each do |post_file|
if post_file.state_pending?
defer_job(Domain::StaticFileJob, { post_file: }, { queue: "bluesky" })
end
end
end
end
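The reply/quote filtering in should_record_post? leans on the AT URI layout ("at://<did>/<collection>/<rkey>"). A minimal sketch of that parsing, using a made-up URI:

  root_uri = "at://did:plc:examplexyz/app.bsky.feed.post/3kabc123"
  root_uri.split("/")
  # => ["at:", "", "did:plc:examplexyz", "app.bsky.feed.post", "3kabc123"]
  root_did = root_uri.split("/")[2] # => "did:plc:examplexyz"
  root_did == user.did # record the reply only when the root post is the user's own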

View File

@@ -0,0 +1,266 @@
# typed: strict
# frozen_string_literal: true
class Domain::Bluesky::Job::ScanUserFollowsJob < Domain::Bluesky::Job::Base
self.default_priority = -10
sig { override.params(args: T::Hash[Symbol, T.untyped]).returns(T.untyped) }
def perform(args)
user = user_from_args!
last_follows_scan = user.follows_scans.where(state: "completed").last
if (ca = last_follows_scan&.created_at) && (ca > 1.month.ago) &&
!force_scan?
logger.info(
format_tags(
"skipping user #{user.did} follows scan",
make_tags(
ago: time_ago_in_words(ca),
last_scan_id: last_follows_scan.id,
),
),
)
else
perform_scan_type(
user,
"follows",
bsky_method: "app.bsky.graph.getFollows",
bsky_field: "follows",
edge_name: :user_user_follows_from,
user_attr: :from_id,
other_attr: :to_id,
)
end
last_followed_by_scan =
user.followed_by_scans.where(state: "completed").last
if (ca = last_followed_by_scan&.created_at) && (ca > 1.month.ago) &&
!force_scan?
logger.info(
format_tags(
"skipping user #{user.did} followed by scan",
make_tags(
ago: time_ago_in_words(ca),
last_scan_id: last_followed_by_scan.id,
),
),
)
else
perform_scan_type(
user,
"followed_by",
bsky_method: "app.bsky.graph.getFollowers",
bsky_field: "followers",
edge_name: :user_user_follows_to,
user_attr: :to_id,
other_attr: :from_id,
)
end
end
private
sig do
params(
user: Domain::User::BlueskyUser,
kind: String,
bsky_method: String,
bsky_field: String,
edge_name: Symbol,
user_attr: Symbol,
other_attr: Symbol,
).void
end
def perform_scan_type(
user,
kind,
bsky_method:,
bsky_field:,
edge_name:,
user_attr:,
other_attr:
)
scan = Domain::UserJobEvent::FollowScan.create!(user:, kind:)
cursor = T.let(nil, T.nilable(String))
page = 0
subjects_data = T.let([], T::Array[Bluesky::Graph::Subject])
loop do
# fetch the next page of follows/followers (per bsky_method)
xrpc_url =
"https://public.api.bsky.app/xrpc/#{bsky_method}?actor=#{user.did!}&limit=100"
xrpc_url = "#{xrpc_url}&cursor=#{cursor}" if cursor
response = http_client.get(xrpc_url)
scan.update!(log_entry: response.log_entry) if page == 0
page += 1
if response.status_code != 200
fatal_error(
format_tags(
"failed to get user #{kind}",
make_tags(status_code: response.status_code),
),
)
end
data = JSON.parse(response.body)
if data["error"]
fatal_error(
format_tags(
"failed to get user #{kind}",
make_tags(error: data["error"]),
),
)
end
subjects_data.concat(
data[bsky_field].map do |subject_data|
Bluesky::Graph::Subject.from_json(subject_data)
end,
)
cursor = data["cursor"]
break if cursor.nil?
end
handle_subjects_data(
user,
subjects_data,
scan,
edge_name:,
user_attr:,
other_attr:,
)
scan.update!(state: "completed", completed_at: Time.current)
logger.info(
format_tags(
"completed user #{kind} scan",
make_tags(num_subjects: subjects_data.size),
),
)
rescue => e
scan.update!(state: "error", completed_at: Time.current) if scan
raise e
end
sig do
params(
user: Domain::User::BlueskyUser,
subjects: T::Array[Bluesky::Graph::Subject],
scan: Domain::UserJobEvent::FollowScan,
edge_name: Symbol,
user_attr: Symbol,
other_attr: Symbol,
).void
end
def handle_subjects_data(
user,
subjects,
scan,
edge_name:,
user_attr:,
other_attr:
)
subjects_by_did =
T.cast(subjects.index_by(&:did), T::Hash[String, Bluesky::Graph::Subject])
users_by_did =
T.cast(
Domain::User::BlueskyUser.where(did: subjects_by_did.keys).index_by(
&:did
),
T::Hash[String, Domain::User::BlueskyUser],
)
missing_user_dids = subjects_by_did.keys - users_by_did.keys
missing_user_dids.each do |did|
subject = subjects_by_did[did] || next
users_by_did[did] = create_user_from_subject(subject)
end
users_by_id = users_by_did.values.map { |u| [T.must(u.id), u] }.to_h
existing_subject_ids =
T.cast(user.send(edge_name).pluck(other_attr), T::Array[Integer])
new_user_ids = users_by_did.values.map(&:id).compact - existing_subject_ids
removed_user_ids =
existing_subject_ids - users_by_did.values.map(&:id).compact
follow_upsert_attrs = []
unfollow_upsert_attrs = []
referenced_user_ids = Set.new([user.id])
new_user_ids.each do |new_user_id|
new_user_did = users_by_id[new_user_id]&.did
followed_at = new_user_did && subjects_by_did[new_user_did]&.created_at
referenced_user_ids.add(new_user_id)
follow_upsert_attrs << {
user_attr => user.id,
other_attr => new_user_id,
:followed_at => followed_at,
:removed_at => nil,
}
end
removed_at = Time.current
removed_user_ids.each do |removed_user_id|
referenced_user_ids.add(removed_user_id)
unfollow_upsert_attrs << {
user_attr => user.id,
other_attr => removed_user_id,
:removed_at => removed_at,
}
end
Domain::User.transaction do
follow_upsert_attrs.each_slice(5000) do |slice|
Domain::UserUserFollow.upsert_all(slice, unique_by: %i[from_id to_id])
end
unfollow_upsert_attrs.each_slice(5000) do |slice|
Domain::UserUserFollow.upsert_all(slice, unique_by: %i[from_id to_id])
end
end
# reset counter caches
Domain::User.transaction do
referenced_user_ids.each do |user_id|
Domain::User.reset_counters(
user_id,
:user_user_follows_from,
:user_user_follows_to,
)
end
end
update_attrs = {
num_created_users: missing_user_dids.size,
num_existing_assocs: existing_subject_ids.size,
num_new_assocs: new_user_ids.size,
num_removed_assocs: removed_user_ids.size,
num_total_assocs: subjects.size,
}
logger.info(
format_tags("updated user #{edge_name}", make_tags(update_attrs)),
)
scan.update_json_attributes!(update_attrs)
user.touch
end
sig do
params(subject: Bluesky::Graph::Subject).returns(Domain::User::BlueskyUser)
end
def create_user_from_subject(subject)
user =
Domain::User::BlueskyUser.create!(
did: subject.did,
handle: subject.handle,
display_name: subject.display_name,
description: subject.description,
)
avatar = user.create_avatar(url_str: subject.avatar)
defer_job(Domain::Bluesky::Job::ScanUserJob, { user: }, { priority: 0 })
defer_job(Domain::UserAvatarJob, { avatar: }, { priority: -1 })
user
end
end

View File

@@ -1,271 +1,110 @@
# typed: strict
class Domain::Bluesky::Job::ScanUserJob < Domain::Bluesky::Job::Base
self.default_priority = -30
self.default_priority = -20
sig { override.params(args: T::Hash[Symbol, T.untyped]).returns(T.untyped) }
def perform(args)
user = user_from_args!
logger.push_tags(make_arg_tag(user))
logger.info("Starting Bluesky user scan for #{user.handle}")
logger.info(format_tags("starting profile scan"))
return if buggy_user?(user)
# Scan user profile/bio
user = scan_user_profile(user) if force_scan? ||
user.scanned_profile_at.nil? || due_for_profile_scan?(user)
# Scan user's historical posts
if user.state_ok? &&
(
force_scan? || user.scanned_posts_at.nil? ||
due_for_posts_scan?(user)
)
scan_user_posts(user)
if user.state_account_disabled? && !force_scan?
logger.info(format_tags("account is disabled, skipping profile scan"))
return
end
logger.info("Completed Bluesky user scan")
if !user.profile_scan.due? && !force_scan?
logger.info(
format_tags(
"skipping profile scan",
make_tags(scanned_at: user.profile_scan.ago_in_words),
),
)
return
end
scan_user_profile(user)
user.scanned_profile_at = Time.zone.now
logger.info(format_tags("completed profile scan"))
ensure
user.save! if user
end
private
sig do
params(user: Domain::User::BlueskyUser).returns(Domain::User::BlueskyUser)
end
sig { params(user: Domain::User::BlueskyUser).void }
def scan_user_profile(user)
logger.info("Scanning user profile for #{user.handle}")
logger.info(format_tags("scanning user profile"))
profile_scan = Domain::UserJobEvent::ProfileScan.create!(user:)
# Use AT Protocol API to get user profile
# Use Bluesky Actor API to get user profile
profile_url =
"https://bsky.social/xrpc/com.atproto.repo.getRecord?repo=#{user.did}&collection=app.bsky.actor.profile&rkey=self"
"https://public.api.bsky.app/xrpc/app.bsky.actor.getProfile?actor=#{user.did}"
response = http_client.get(profile_url)
if response.status_code != 200
logger.error("Failed to get user profile: #{response.status_code}")
user.state_error!
return user
user.last_scan_log_entry = response.log_entry
profile_scan.update!(log_entry: response.log_entry)
if response.status_code == 400
error = JSON.parse(response.body)["error"]
if error == "InvalidRequest"
logger.error(format_tags("account is disabled / does not exist"))
user.state = "account_disabled"
return
end
end
# Note: Store log entry reference if needed for debugging
if response.status_code != 200
fatal_error(
format_tags(
"failed to get user profile",
make_tags(status_code: response.status_code),
),
)
end
begin
profile_data = JSON.parse(response.body)
if profile_data["error"]
logger.error("Profile API error: #{profile_data["error"]}")
user.state_error!
return user
end
record = profile_data["value"]
if record
# Update user profile information
user.description = record["description"]
user.display_name = record["displayName"]
user.profile_raw = record
# Process avatar if present
if record["avatar"] && record["avatar"]["ref"]
process_user_avatar(user, record["avatar"])
end
end
user.scanned_profile_at = Time.current
user.state_ok! unless user.state_error?
rescue JSON::ParserError => e
logger.error("Failed to parse profile JSON: #{e.message}")
user.state_error!
end
user
end
sig { params(user: Domain::User::BlueskyUser).void }
def scan_user_posts(user)
logger.info("Scanning historical posts for #{user.handle}")
# Use AT Protocol API to list user's posts
posts_url =
"https://bsky.social/xrpc/com.atproto.repo.listRecords?repo=#{user.did}&collection=app.bsky.feed.post&limit=100"
cursor = T.let(nil, T.nilable(String))
posts_processed = 0
posts_with_media = 0
loop do
url = cursor ? "#{posts_url}&cursor=#{cursor}" : posts_url
response = http_client.get(url)
if response.status_code != 200
logger.error("Failed to get user posts: #{response.status_code}")
break
end
begin
data = JSON.parse(response.body)
if data["error"]
logger.error("Posts API error: #{data["error"]}")
break
end
records = data["records"] || []
records.each do |record_data|
posts_processed += 1
record = record_data["value"]
next unless record && record["embed"]
# Only process posts with media
posts_with_media += 1
user_did = user.did
next unless user_did
process_historical_post(user, record_data, record, user_did)
end
cursor = data["cursor"]
break if cursor.nil? || records.empty?
# Add small delay to avoid rate limiting
sleep(0.1)
rescue JSON::ParserError => e
logger.error("Failed to parse posts JSON: #{e.message}")
break
end
end
user.scanned_posts_at = Time.current
logger.info(
"Processed #{posts_processed} posts, #{posts_with_media} with media",
)
end
sig do
params(
user: Domain::User::BlueskyUser,
record_data: T::Hash[String, T.untyped],
record: T::Hash[String, T.untyped],
user_did: String,
).void
end
def process_historical_post(user, record_data, record, user_did)
uri = record_data["uri"]
rkey = record_data["uri"].split("/").last
# Check if we already have this post
existing_post = Domain::Post::BlueskyPost.find_by(at_uri: uri)
return if existing_post
begin
post =
Domain::Post::BlueskyPost.create!(
at_uri: uri,
bluesky_rkey: rkey,
text: record["text"] || "",
bluesky_created_at: Time.parse(record["createdAt"]),
post_raw: record,
)
post.creator = user
post.save!
# Process media if present
process_post_media(post, record["embed"], user_did) if record["embed"]
logger.debug("Created historical post: #{post.bluesky_rkey}")
rescue => e
logger.error("Failed to create historical post #{rkey}: #{e.message}")
end
end
sig do
params(
post: Domain::Post::BlueskyPost,
embed_data: T::Hash[String, T.untyped],
did: String,
).void
end
def process_post_media(post, embed_data, did)
case embed_data["$type"]
when "app.bsky.embed.images"
process_post_images(post, embed_data["images"], did)
when "app.bsky.embed.recordWithMedia"
if embed_data["media"] &&
embed_data["media"]["$type"] == "app.bsky.embed.images"
process_post_images(post, embed_data["media"]["images"], did)
end
when "app.bsky.embed.external"
process_external_embed(post, embed_data["external"], did)
end
end
sig do
params(
post: Domain::Post::BlueskyPost,
images: T::Array[T::Hash[String, T.untyped]],
did: String,
).void
end
def process_post_images(post, images, did)
files = []
images.each_with_index do |image_data, index|
blob_data = image_data["image"]
next unless blob_data && blob_data["ref"]
post_file =
post.files.build(
type: "Domain::PostFile::BlueskyPostFile",
file_order: index,
url_str: construct_blob_url(did, blob_data["ref"]["$link"]),
state: "pending",
alt_text: image_data["alt"],
blob_ref: blob_data["ref"]["$link"],
)
# Store aspect ratio if present
if image_data["aspectRatio"]
post_file.aspect_ratio_width = image_data["aspectRatio"]["width"]
post_file.aspect_ratio_height = image_data["aspectRatio"]["height"]
end
post_file.save!
Domain::StaticFileJob.perform_later({ post_file: })
files << post_file
end
logger.debug(
"Created #{files.size} #{"file".pluralize(files.size)} for historical post: #{post.bluesky_rkey}",
)
end
sig do
params(
post: Domain::Post::BlueskyPost,
external_data: T::Hash[String, T.untyped],
did: String,
).void
end
def process_external_embed(post, external_data, did)
thumb_data = external_data["thumb"]
return unless thumb_data && thumb_data["ref"]
post_file =
post.files.build(
type: "Domain::PostFile::BlueskyPostFile",
file_order: 0,
url_str: construct_blob_url(did, thumb_data["ref"]["$link"]),
state: "pending",
blob_ref: thumb_data["ref"]["$link"],
fatal_error(
format_tags(
"failed to parse profile JSON",
make_tags(error: e.message),
),
)
end
post_file.save!
Domain::StaticFileJob.perform_later({ post_file: })
if profile_data["error"]
fatal_error(
format_tags(
"profile API error",
make_tags(error: profile_data["error"]),
),
)
end
logger.debug(
"Created external thumbnail for historical post: #{post.bluesky_rkey}",
)
# The getProfile endpoint returns the profile data directly, not wrapped in "value"
record = profile_data
if record
# Update user profile information
user.description = record["description"]
user.display_name = record["displayName"]
user.profile_raw = record
# Set registration time from profile createdAt
if record["createdAt"]
user.registered_at = Time.parse(record["createdAt"]).in_time_zone("UTC")
logger.info(
format_tags(
"set user registration time",
make_tags(registered_at: user.registered_at),
),
)
end
# Process avatar if present
process_user_avatar_url(user, record["avatar"]) if record["avatar"]
end
end
sig do
@@ -275,36 +114,119 @@ class Domain::Bluesky::Job::ScanUserJob < Domain::Bluesky::Job::Base
).void
end
def process_user_avatar(user, avatar_data)
return if user.avatar.present?
logger.debug(format_tags("processing user avatar", make_tags(avatar_data:)))
return unless avatar_data["ref"]
user_did = user.did
return unless user_did
user.create_avatar!(
url_str: construct_blob_url(user_did, avatar_data["ref"]["$link"]),
avatar_url =
Bluesky::ProcessPostHelper.construct_blob_url(
user_did,
avatar_data["ref"]["$link"],
)
logger.debug(format_tags("extract avatar url", make_tags(avatar_url:)))
# Check if avatar already exists and is downloaded
existing_avatar = user.avatar
if existing_avatar.present?
logger.debug(
format_tags(
"existing avatar found",
make_tags(state: existing_avatar.state),
),
)
# Only enqueue if the avatar URL has changed or it's not downloaded yet
if existing_avatar.url_str != avatar_url
avatar = user.avatars.create!(url_str: avatar_url)
logger.info(
format_tags(
"avatar url changed, creating new avatar",
make_arg_tag(avatar),
),
)
defer_job(
Domain::UserAvatarJob,
{ avatar: avatar },
{ queue: "bluesky", priority: -30 },
)
elsif existing_avatar.state_pending?
defer_job(
Domain::UserAvatarJob,
{ avatar: existing_avatar },
{ queue: "bluesky", priority: -30 },
)
logger.info(format_tags("re-enqueued pending avatar download"))
end
else
# Create new avatar and enqueue download
avatar = user.avatars.create!(url_str: avatar_url)
defer_job(
Domain::UserAvatarJob,
{ avatar: },
{ queue: "bluesky", priority: -30 },
)
logger.info(
format_tags(
"created avatar and enqueued download",
make_arg_tag(avatar),
),
)
end
end
sig { params(user: Domain::User::BlueskyUser, avatar_url: String).void }
def process_user_avatar_url(user, avatar_url)
logger.debug(
format_tags("processing user avatar url", make_tags(avatar_url:)),
)
return if avatar_url.blank?
# Enqueue avatar download job if we had one
logger.debug("Created avatar for user: #{user.handle}")
end
sig { params(did: String, cid: String).returns(String) }
def construct_blob_url(did, cid)
"https://bsky.social/xrpc/com.atproto.sync.getBlob?did=#{did}&cid=#{cid}"
end
sig { params(user: Domain::User::BlueskyUser).returns(T::Boolean) }
def due_for_profile_scan?(user)
scanned_at = user.scanned_profile_at
return true if scanned_at.nil?
scanned_at < 1.month.ago
end
sig { params(user: Domain::User::BlueskyUser).returns(T::Boolean) }
def due_for_posts_scan?(user)
scanned_at = user.scanned_posts_at
return true if scanned_at.nil?
scanned_at < 1.week.ago
# Check if avatar already exists and is downloaded
existing_avatar = user.avatar
if existing_avatar.present?
logger.debug(
format_tags(
"existing avatar found",
make_tags(state: existing_avatar.state),
),
)
# Only enqueue if the avatar URL has changed or it's not downloaded yet
if existing_avatar.url_str != avatar_url
avatar = user.avatars.create!(url_str: avatar_url)
logger.info(
format_tags(
"avatar url changed, creating new avatar",
make_arg_tag(avatar),
),
)
defer_job(
Domain::UserAvatarJob,
{ avatar: avatar },
{ queue: "bluesky", priority: -30 },
)
elsif existing_avatar.state_pending?
defer_job(
Domain::UserAvatarJob,
{ avatar: existing_avatar },
{ queue: "bluesky", priority: -30 },
)
logger.info(format_tags("re-enqueued pending avatar download"))
end
else
# Create new avatar and enqueue download
avatar = user.avatars.create!(url_str: avatar_url)
defer_job(
Domain::UserAvatarJob,
{ avatar: },
{ queue: "bluesky", priority: -30 },
)
logger.info(
format_tags(
"created avatar and enqueued download",
make_arg_tag(avatar),
),
)
end
end
end

View File

@@ -121,9 +121,11 @@ class Domain::E621::Job::ScanUserFavsJob < Domain::E621::Job::Base
logger.info "upserting #{post_ids.size} favs"
post_ids.each_slice(1000) do |slice|
ReduxApplicationRecord.transaction do
Domain::UserPostFav.upsert_all(
slice.map { |post_id| { user_id: user.id, post_id: post_id } },
unique_by: :index_domain_user_post_favs_on_user_id_and_post_id,
Domain::UserPostFav::E621UserPostFav.upsert_all(
slice.map do |post_id|
{ user_id: user.id, post_id: post_id, removed: false }
end,
unique_by: %i[user_id post_id],
)
end
end

View File

@@ -11,7 +11,8 @@ class Domain::Fa::Job::Base < Scraper::JobBase
protected
BUGGY_USER_URL_NAMES = T.let(["click here", "..", "."], T::Array[String])
BUGGY_USER_URL_NAMES =
T.let(["click here", "..", ".", "<i class="], T::Array[String])
sig { params(user: Domain::User::FaUser).returns(T::Boolean) }
def buggy_user?(user)

View File

@@ -2,6 +2,7 @@
class Domain::PostFileThumbnailJob < Scraper::JobBase
queue_as :thumbnails
discard_on Vips::Error
retry_on LoadedMedia::FileNotFound
sig { override.returns(Symbol) }
def self.http_factory_method

View File

@@ -2,6 +2,7 @@
class Domain::StaticFileJob < Scraper::JobBase
include Domain::StaticFileJobHelper
queue_as :static_file
discard_on ActiveJob::DeserializationError
sig { override.returns(Symbol) }
def self.http_factory_method

View File

@@ -105,7 +105,9 @@ module Domain::StaticFileJobHelper
end
end
ensure
post_file.save! if post_file
post_file.save!
post = post_file.post
post.touch if post
if should_enqueue_thumbnail_job
defer_job(Domain::PostFileThumbnailJob, { post_file: })
end

View File

@@ -1,7 +1,14 @@
# typed: strict
# frozen_string_literal: true
class Domain::UserAvatarJob < Scraper::JobBase
abstract!
queue_as :static_file
discard_on ActiveJob::DeserializationError
sig { override.returns(Symbol) }
def self.http_factory_method
:get_generic_http_client
end
sig { override.params(args: T::Hash[Symbol, T.untyped]).returns(T.untyped) }
def perform(args)
@@ -19,6 +26,9 @@ class Domain::UserAvatarJob < Scraper::JobBase
self.first_log_entry ||= response.log_entry
avatar.last_log_entry = response.log_entry
return if check_bluesky_force_rescan?(response, avatar)
return if check_bluesky_terminal_error?(response, avatar)
case response.status_code
when 200
avatar.state = "ok"
@@ -39,6 +49,55 @@ class Domain::UserAvatarJob < Scraper::JobBase
end
end
ensure
avatar.save! if avatar
if avatar
avatar.save!
user = avatar.user
user.touch if user
end
end
sig do
params(
response: Scraper::HttpClient::Response,
avatar: Domain::UserAvatar,
).returns(T::Boolean)
end
def check_bluesky_force_rescan?(response, avatar)
return false unless response.status_code == 400
unless avatar.url_str&.starts_with?(
"https://bsky.social/xrpc/com.atproto.sync.getBlob",
)
return false
end
data = JSON.parse(response.body)
# not the right error from bsky
return false unless data["error"] == "RepoDeactivated"
# already enqueued force rescan of user
return false if avatar.error_message == "RepoDeactivated"
logger.warn(format_tags("bsky blob 400, force rescan user"))
avatar.state = "http_error"
avatar.error_message = "RepoDeactivated"
avatar.save!
Domain::Bluesky::Job::ScanUserJob.perform_later(
user: avatar.user,
force_scan: true,
)
true
end
sig do
params(
response: Scraper::HttpClient::Response,
avatar: Domain::UserAvatar,
).returns(T::Boolean)
end
def check_bluesky_terminal_error?(response, avatar)
return false unless [422, 500, 504].include?(response.status_code)
return false unless avatar.url_str&.starts_with?("https://cdn.bsky.app")
return true
end
end

View File

@@ -70,8 +70,7 @@ class Scraper::JobBase < ApplicationJob
sig { params(args: T.untyped).void }
def initialize(*args)
super(*T.unsafe(args))
@deferred_jobs = T.let(Set.new, T::Set[DeferredJob])
@suppressed_jobs = T.let(Set.new, T::Set[SuppressedJob])
@deferred_job_sink = T.let(DeferredJobSink.new(self.class), DeferredJobSink)
@http_client = T.let(nil, T.nilable(Scraper::HttpClient))
@tor_http_client = T.let(nil, T.nilable(Scraper::HttpClient))
@gallery_dl_client = T.let(nil, T.nilable(Scraper::GalleryDlClient))
@@ -122,7 +121,13 @@ class Scraper::JobBase < ApplicationJob
sig { returns(Domain::UserAvatar) }
def avatar_from_args!
T.cast(arguments[0][:avatar], Domain::UserAvatar)
if (avatar = arguments[0][:avatar])
T.cast(avatar, Domain::UserAvatar)
elsif (user = arguments[0][:user])
T.must(T.cast(user, Domain::User).avatar)
else
raise("no avatar found in arguments: #{arguments.inspect}")
end
end
sig { returns(Domain::PostFile) }
@@ -250,78 +255,12 @@ class Scraper::JobBase < ApplicationJob
)
end
sig do
params(
job_class: T.class_of(Scraper::JobBase),
params: T::Hash[Symbol, T.untyped],
set_args: T::Hash[Symbol, T.untyped],
).returns(T::Boolean)
end
def defer_job(job_class, params, set_args = {})
!!@deferred_jobs.add?(DeferredJob.new(job_class:, params:, set_args:))
end
sig do
params(
job_class: T.class_of(Scraper::JobBase),
params: T::Hash[Symbol, T.untyped],
).void
end
def suppress_deferred_job(job_class, params)
ignore_args = job_class.gather_ignore_signature_args
params_cleared =
params.reject { |key, value| ignore_args.include?(key.to_sym) }
!!@suppressed_jobs.add?(
SuppressedJob.new(job_class:, params: params_cleared),
)
end
delegate :defer_job, to: :@deferred_job_sink
delegate :suppress_deferred_job, to: :@deferred_job_sink
sig { void }
def enqueue_deferred_jobs!
jobs_to_enqueue =
@deferred_jobs.filter_map do |deferred_job|
if @suppressed_jobs.any? { |suppressed_job|
if suppressed_job.matches?(deferred_job)
logger.info(
"suppressing deferred job #{deferred_job.job_class.name} with params #{deferred_job.describe_params}",
)
true
end
}
nil
else
deferred_job
end
end
GoodJob::Bulk.enqueue do
jobs_to_enqueue.each do |deferred_job|
args =
deferred_job.params.merge(
{
caused_by_entry: causing_log_entry,
caused_by_job_id: self.job_id,
},
)
set_args = deferred_job.set_args
job = deferred_job.job_class.set(set_args).perform_later(args)
Scraper::Metrics::JobBaseMetrics.observe_job_enqueued(
source_class: self.class,
enqueued_class: deferred_job.job_class,
)
if job
logger.info(
format_tags(
make_tag("job_class", deferred_job.job_class.name),
(make_tag("job_id", job.job_id)),
"enqueue deferred job",
),
)
end
end
rescue StandardError => e
logger.error("error enqueueing jobs: #{e.class.name} - #{e.message}")
end
@deferred_job_sink.enqueue_deferred_jobs!(causing_log_entry, self.job_id)
end
sig { params(msg: T.untyped).returns(T.noreturn) }

app/lib/bluesky.rb Normal file
View File

@@ -0,0 +1,5 @@
# typed: strict
# frozen_string_literal: true
module Bluesky
end

app/lib/bluesky/graph.rb Normal file
View File

@@ -0,0 +1,7 @@
# typed: strict
# frozen_string_literal: true
module Bluesky
module Graph
end
end

View File

@@ -0,0 +1,31 @@
# typed: strict
# frozen_string_literal: true
module Bluesky
module Graph
class Subject < T::ImmutableStruct
extend T::Sig
include T::Struct::ActsAsComparable
const :did, String
const :handle, String
const :display_name, T.nilable(String)
const :description, T.nilable(String)
const :avatar, T.nilable(String)
const :indexed_at, T.nilable(Time)
const :created_at, T.nilable(Time)
sig { params(json: T::Hash[String, T.untyped]).returns(Subject) }
def self.from_json(json)
new(
did: json["did"],
handle: json["handle"],
display_name: json["displayName"],
description: json["description"],
avatar: json["avatar"],
indexed_at: (ia = json["indexedAt"]) && Time.zone.parse(ia),
created_at: (ca = json["createdAt"]) && Time.zone.parse(ca),
)
end
end
end
end
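A short sketch of Subject.from_json applied to one follows/followers record, with placeholder values (the keys mirror the accessors read above):

  subject =
    Bluesky::Graph::Subject.from_json(
      {
        "did" => "did:plc:examplexyz",
        "handle" => "alice.example.bsky.social",
        "displayName" => "Alice",
        "description" => nil,
        "avatar" => nil,
        "indexedAt" => "2025-08-01T00:00:00Z",
        "createdAt" => "2024-01-15T12:34:56Z",
      },
    )
  subject.handle # => "alice.example.bsky.social"
  subject.created_at # => Time.zone.parse("2024-01-15T12:34:56Z")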

View File

@@ -0,0 +1,150 @@
# typed: strict
# frozen_string_literal: true
#
class Bluesky::ProcessPostHelper
extend T::Sig
include HasColorLogger
IMAGE_OR_VIDEO =
T.let(%w[app.bsky.embed.images app.bsky.embed.video], T::Array[String])
sig { params(deferred_job_sink: DeferredJobSink).void }
def initialize(deferred_job_sink)
@deferred_job_sink = deferred_job_sink
end
sig { params(did: String, cid: String).returns(String) }
def self.construct_blob_url(did, cid)
"https://bsky.social/xrpc/com.atproto.sync.getBlob?did=#{did}&cid=#{cid}"
end
sig { params(embed_data: T::Hash[String, T.untyped]).returns(T::Boolean) }
def should_process_post?(embed_data)
type = embed_data["$type"]
if IMAGE_OR_VIDEO.include?(type)
true
elsif type == "app.bsky.embed.recordWithMedia"
embed_type = embed_data.dig("media", "$type")
IMAGE_OR_VIDEO.include?(embed_type)
else
false
end
end
sig do
params(
post: Domain::Post::BlueskyPost,
embed_data: T::Hash[String, T.untyped],
did: String,
).void
end
def process_post_media(post, embed_data, did)
case embed_data["$type"]
when "app.bsky.embed.images"
process_post_images(post, embed_data, did)
when "app.bsky.embed.video"
process_post_video(post, embed_data, did)
when "app.bsky.embed.recordWithMedia"
embed_type = embed_data.dig("media", "$type")
if embed_type == "app.bsky.embed.images"
process_post_images(post, embed_data["media"], did)
elsif embed_type == "app.bsky.embed.video"
process_post_video(post, embed_data["media"], did)
end
end
end
sig do
params(
post: Domain::Post::BlueskyPost,
embed_data: T::Hash[String, T.untyped],
did: String,
).void
end
def process_post_images(post, embed_data, did)
images = embed_data.dig("images") || []
images.each_with_index do |image_data, index|
post_file = post.files.build(file_order: index)
set_blob_ref_and_url(post_file, image_data["image"], did)
set_aspect_ratio(post_file, image_data["aspectRatio"])
set_alt_text(post_file, image_data["alt"])
post_file.save!
@deferred_job_sink.defer_job(
Domain::StaticFileJob,
{ post_file: },
{ queue: "bluesky" },
)
logger.debug(
format_tags(
"created image for post",
make_tags(at_uri: post.at_uri, post_file_id: post_file.id),
),
)
end
end
sig do
params(
post: Domain::Post::BlueskyPost,
embed_data: T::Hash[String, T.untyped],
did: String,
).void
end
def process_post_video(post, embed_data, did)
post_file = post.files.build(file_order: 0)
set_blob_ref_and_url(post_file, embed_data["video"], did)
set_aspect_ratio(post_file, embed_data["aspectRatio"])
set_alt_text(post_file, embed_data["alt"])
post_file.save!
@deferred_job_sink.defer_job(
Domain::StaticFileJob,
{ post_file: },
{ queue: "bluesky" },
)
logger.debug(
format_tags(
"created video for post",
make_tags(at_uri: post.at_uri, post_file_id: post_file.id),
),
)
end
sig do
params(
post_file: Domain::PostFile::BlueskyPostFile,
file_data: T::Hash[String, T.untyped],
did: String,
).void
end
def set_blob_ref_and_url(post_file, file_data, did)
return unless file_data.dig("$type") == "blob"
blob_ref = file_data.dig("ref", "$link")
return unless blob_ref
post_file.blob_ref = blob_ref
post_file.url_str = self.class.construct_blob_url(did, blob_ref)
end
sig do
params(
post_file: Domain::PostFile::BlueskyPostFile,
aspect_ratio: T.nilable(T::Hash[String, T.untyped]),
).void
end
def set_aspect_ratio(post_file, aspect_ratio)
return unless aspect_ratio
post_file.aspect_ratio_width = aspect_ratio.dig("width")
post_file.aspect_ratio_height = aspect_ratio.dig("height")
end
sig do
params(
post_file: Domain::PostFile::BlueskyPostFile,
alt_text: T.nilable(String),
).void
end
def set_alt_text(post_file, alt_text)
post_file.alt_text = alt_text if alt_text
end
end
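For orientation, the blob URL construction and the embed-type gate behave roughly as follows (the DID and CID values are placeholders):

  Bluesky::ProcessPostHelper.construct_blob_url("did:plc:examplexyz", "bafyreiexamplecid")
  # => "https://bsky.social/xrpc/com.atproto.sync.getBlob?did=did:plc:examplexyz&cid=bafyreiexamplecid"

  helper = Bluesky::ProcessPostHelper.new(DeferredJobSink.new(Domain::Bluesky::Job::ScanPostsJob))
  helper.should_process_post?({ "$type" => "app.bsky.embed.images" }) # => true
  helper.should_process_post?({ "$type" => "app.bsky.embed.external" }) # => false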

app/lib/bluesky/text.rb Normal file
View File

@@ -0,0 +1,5 @@
# typed: strict
# frozen_string_literal: true
module Bluesky::Text
end

View File

@@ -0,0 +1,22 @@
# typed: strict
# frozen_string_literal: true
class Bluesky::Text::Facet < T::ImmutableStruct
extend T::Sig
const :byteStart, Integer
const :byteEnd, Integer
const :features, T::Array[Bluesky::Text::FacetFeature]
sig { params(hash: T::Hash[String, T.untyped]).returns(Bluesky::Text::Facet) }
def self.from_hash(hash)
new(
byteStart: hash["index"]["byteStart"],
byteEnd: hash["index"]["byteEnd"],
features:
hash["features"].map do |feature|
Bluesky::Text::FacetFeature.from_hash(feature)
end,
)
end
end

View File

@@ -0,0 +1,56 @@
# typed: strict
# frozen_string_literal: true
module Bluesky::Text
class FacetFeature
extend T::Sig
extend T::Helpers
abstract!
sig(:final) do
params(hash: T::Hash[String, T.untyped]).returns(FacetFeature)
end
def self.from_hash(hash)
case hash["$type"]
when "app.bsky.richtext.facet#mention"
FacetFeatureMention.new(hash)
when "app.bsky.richtext.facet#link"
FacetFeatureURI.new(hash)
when "app.bsky.richtext.facet#tag"
FacetFeatureTag.new(hash)
else
raise "Unknown facet feature type: #{hash["$type"]}"
end
end
end
class FacetFeatureURI < FacetFeature
sig { returns(String) }
attr_reader :uri
sig { params(hash: T::Hash[String, T.untyped]).void }
def initialize(hash)
@uri = T.let(hash["uri"], String)
end
end
class FacetFeatureMention < FacetFeature
sig { returns(String) }
attr_reader :did
sig { params(hash: T::Hash[String, T.untyped]).void }
def initialize(hash)
@did = T.let(hash["did"], String)
end
end
class FacetFeatureTag < FacetFeature
sig { returns(String) }
attr_reader :tag
sig { params(hash: T::Hash[String, T.untyped]).void }
def initialize(hash)
@tag = T.let(hash["tag"], String)
end
end
end

View File

@@ -0,0 +1,88 @@
# typed: strict
# frozen_string_literal: true
class DeferredJobSink
extend T::Sig
include HasColorLogger
sig { params(source_class: T.untyped).void }
def initialize(source_class)
@suppressed_jobs = T.let(Set.new, T::Set[SuppressedJob])
@deferred_jobs = T.let(Set.new, T::Set[DeferredJob])
@source_class = source_class
end
sig do
params(
job_class: T.class_of(Scraper::JobBase),
params: T::Hash[Symbol, T.untyped],
set_args: T::Hash[Symbol, T.untyped],
).returns(T::Boolean)
end
def defer_job(job_class, params, set_args = {})
!!@deferred_jobs.add?(DeferredJob.new(job_class:, params:, set_args:))
end
sig do
params(
job_class: T.class_of(Scraper::JobBase),
params: T::Hash[Symbol, T.untyped],
).void
end
def suppress_deferred_job(job_class, params)
ignore_args = job_class.gather_ignore_signature_args
params_cleared =
params.reject { |key, value| ignore_args.include?(key.to_sym) }
!!@suppressed_jobs.add?(
SuppressedJob.new(job_class:, params: params_cleared),
)
end
sig do
params(
caused_by_entry: T.nilable(HttpLogEntry),
caused_by_job_id: T.nilable(String),
).void
end
def enqueue_deferred_jobs!(caused_by_entry = nil, caused_by_job_id = nil)
jobs_to_enqueue =
@deferred_jobs.filter_map do |deferred_job|
if @suppressed_jobs.any? { |suppressed_job|
if suppressed_job.matches?(deferred_job)
logger.info(
"suppressing deferred job #{deferred_job.job_class.name} with params #{deferred_job.describe_params}",
)
true
end
}
nil
else
deferred_job
end
end
GoodJob::Bulk.enqueue do
jobs_to_enqueue.each do |deferred_job|
args =
deferred_job.params.merge({ caused_by_entry:, caused_by_job_id: })
set_args = deferred_job.set_args
job = deferred_job.job_class.set(set_args).perform_later(args)
Scraper::Metrics::JobBaseMetrics.observe_job_enqueued(
source_class: @source_class,
enqueued_class: deferred_job.job_class,
)
if job
logger.info(
format_tags(
make_tag("job_class", deferred_job.job_class.name),
(make_tag("job_id", job.job_id)),
"enqueue deferred job",
),
)
end
end
rescue StandardError => e
logger.error("error enqueueing jobs: #{e.class.name} - #{e.message}")
end
end
end
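Usage mirrors what ProcessPostHelper and the jetstream monitor already do: build a sink, defer jobs into it, then flush. A minimal sketch (post_file is assumed to be an existing Domain::PostFile record):

  sink = DeferredJobSink.new(Domain::Bluesky::Job::ScanPostsJob)
  sink.defer_job(Domain::StaticFileJob, { post_file: }, { queue: "bluesky" })
  sink.enqueue_deferred_jobs! # bulk-enqueues via GoodJob::Bulk.enqueue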

View File

@@ -122,6 +122,9 @@ class Domain::Fa::Parser::Page < Domain::Fa::Parser::Base
sig { returns(ActiveSupport::TimeZone) }
def logged_in_user_tz
# server default for unauthenticated requests
return ActiveSupport::TimeZone.new("America/New_York") unless logged_in?
case logged_in_user
when "zzreg", "cottoniq"
ActiveSupport::TimeZone.new("America/Los_Angeles")

View File

@@ -170,7 +170,23 @@ class Domain::Fa::Parser::SubmissionParserHelper < Domain::Fa::Parser::Base
end
when VERSION_2
date_str = @elem.css(".popup_date").first["title"]
time_zone_offset.strptime(date_str, "%b %d, %Y %I:%M %p") if date_str
if date_str
[
# version 2, pre September 2025 - formatted like "Jan 20, 2025 11:23 AM"
"%b %d, %Y %I:%M %p",
# version 2, post September 2025 - formatted like "September 7, 2025, 10:48:53"
"%B %e, %Y, %H:%M:%S",
].lazy
.map do |format|
begin
time_zone_offset.strptime(date_str, format)
rescue ArgumentError
nil
end
end
.find(&:present?) ||
raise(ArgumentError.new("invalid date string: `#{date_str}`"))
end
else
raise("unimplemented version #{@page_version}")
end

View File

@@ -425,6 +425,7 @@ class Domain::Fa::Parser::UserPageHelper < Domain::Fa::Parser::Base
href = link_elem["href"]
url_name =
%r{/user/(.+)/}.match(href)&.[](1) || raise("invalid url: #{href}")
url_name = CGI.unescape(url_name)
if @march_2025_update
name =

View File

@@ -45,8 +45,8 @@ class EnqueueJobBase < Tasks::InterruptableTask
def start_enqueuing
end
sig { params(block: T.proc.void).void }
def enqueue(&block)
sig { params(always_recheck: T::Boolean, block: T.proc.void).void }
def enqueue(always_recheck: false, &block)
# if we're under the high water mark, we can just enqueue and return
# so we get called again as soon as possible
if @inferred_queue_size < high_water_mark
@@ -64,7 +64,11 @@ class EnqueueJobBase < Tasks::InterruptableTask
end
block.call
@inferred_queue_size += 1
if always_recheck
@inferred_queue_size = queue_size
else
@inferred_queue_size += 1
end
@total_performed += 1
end

View File

@@ -27,6 +27,11 @@ module HasColorLogger
self.class.make_tag(tag_name, tag_value)
end
sig { params(tags: T::Hash[Symbol, T.untyped]).returns(T::Array[String]) }
def make_tags(tags)
self.class.make_tags(tags)
end
sig { params(tags: T.any(String, T::Array[String])).returns(String) }
def format_tags(*tags)
self.class.format_tags(*T.unsafe([tags].flatten))
@@ -47,15 +52,20 @@ module HasColorLogger
module ClassMethods
extend T::Sig
sig { params(tags: T::Hash[Symbol, T.untyped]).returns(T::Array[String]) }
def make_tags(tags)
tags.map { |tag_name, tag_value| make_tag(tag_name.to_s, tag_value) }
end
sig { params(tag_name: String, tag_value: T.untyped).returns(String) }
def make_tag(tag_name, tag_value)
tag_value_str = tag_value ? tag_value.to_s.bold : "(nil)".italic
"#{tag_name}: #{tag_value_str}"
end
sig { params(tags: String).returns(String) }
sig { params(tags: T.any(String, T::Array[String])).returns(String) }
def format_tags(*tags)
format_tags_arr(tags)
format_tags_arr(T.unsafe([tags].flatten))
end
sig { params(tags: T::Array[String]).returns(String) }

View File

@@ -6,6 +6,9 @@ class LoadedMedia
extend T::Helpers
abstract!
class FileNotFound < StandardError
end
sig do
params(content_type: String, media_path: String).returns(
T.nilable(LoadedMedia),
@@ -23,8 +26,8 @@ class LoadedMedia
)
when %r{image/jpeg}, %r{image/jpg}, %r{image/png}, %r{image/bmp}
LoadedMedia::StaticImage.new(media_path)
when %r{video/webm}
LoadedMedia::Webm.new(media_path)
when %r{video/webm}, %r{video/mp4}
LoadedMedia::WebmOrMp4.new(media_path)
else
return nil
end

View File

@@ -5,6 +5,12 @@ class LoadedMedia::StaticImage < LoadedMedia
sig { params(media_path: String).void }
def initialize(media_path)
@vips_image = T.let(Vips::Image.new_from_file(media_path), Vips::Image)
rescue Vips::Error => e
if e.message.include?("does not exist")
raise LoadedMedia::FileNotFound.new(e)
end
raise
end
sig { override.returns(Integer) }

View File

@@ -1,7 +1,7 @@
# typed: strict
# frozen_string_literal: true
class LoadedMedia::Webm < LoadedMedia
class LoadedMedia::WebmOrMp4 < LoadedMedia
include HasColorLogger
::FFMPEG.logger = Logger.new(nil)
@@ -74,42 +74,33 @@ class LoadedMedia::Webm < LoadedMedia
File.join(BlobFile::TMP_DIR, "webm-#{frame}-#{SecureRandom.uuid}.png")
FileUtils.mkdir_p(File.dirname(tmp_path))
# @media.screenshot(tmp_path, { seek_time: frame_time })
# Always seek from the beginning for simplicity and reliability
# Add a small safety margin for the last frame to avoid seeking beyond video duration
safe_seek_time = [frame_time, @duration - 0.01].min
# Determine if we should seek from start or end based on where we are in the file
past_halfway = frame_time / @duration > 0.5
cmd = [FFMPEG_BIN, "-y", "-xerror", "-abort_on", "empty_output"] # Overwrite output files
if past_halfway
# For frames in the second half of the file, seek from the end
# Convert to a negative offset from the end
end_offset = frame_time - @duration
cmd.concat(["-sseof", end_offset.round(2).to_s])
else
# For frames in the first half, seek from the beginning
cmd.concat(["-ss", frame_time.round(2).to_s])
end
# Add input file and frame extraction options
cmd.concat(
[
"-i",
@media_path, # Input file
"-vframes",
"1", # Extract one frame
"-f",
"image2", # Force format to image2
"-update",
"1", # Update existing file
tmp_path,
],
)
cmd = [
FFMPEG_BIN,
"-y", # Overwrite output files
"-xerror", # Exit on error
"-abort_on",
"empty_output", # Abort if output is empty
"-ss",
safe_seek_time.round(2).to_s,
"-i",
@media_path, # Input file
"-vframes",
"1", # Extract one frame
"-f",
"image2", # Force format to image2
"-update",
"1", # Update existing file
tmp_path,
]
_output, error, status = Open3.capture3(*cmd)
unless status.exitstatus == 0
$stderr.print error
raise "Failed to extract frame with ffmpeg: #{error}"
raise "Failed to extract frame with ffmpeg: #{cmd.join(" ")}: #{error}"
end
# Use the original frame number in the error message

View File

@@ -65,9 +65,7 @@ class Scraper::CurlHttpPerformer
curl = get_curl
start_at = Time.now
# TODO - normalizing the URL breaks URLs with utf-8 characters
# curl.url = request.uri.normalize.to_s
curl.url = request.uri.to_s
curl.url = request.uri.normalize.to_s
curl.follow_location = request.follow_redirects
request.request_headers.each { |key, value| curl.headers[key.to_s] = value }
curl.headers["User-Agent"] = "FurryArchiver/1.0 / telegram: @DeltaNoises"

View File

@@ -25,8 +25,7 @@ module Scraper::Metrics::JobBaseMetrics
sig do
params(
source_class:
T.any(T.class_of(Scraper::JobBase), T.class_of(ReduxApplicationRecord)),
source_class: T.untyped,
enqueued_class: T.class_of(Scraper::JobBase),
).void
end

View File

@@ -42,7 +42,7 @@ module Stats::Helpers
records_array.map do |record|
Stats::DataPoint.new(
x: record.fav_id.to_f,
x: record.fa_fav_id.to_f,
y: T.must(record.explicit_time).to_f,
)
end

View File

@@ -12,22 +12,42 @@ module Tasks::Bluesky
@pg_notify = pg_notify
@resolver = T.let(DIDKit::Resolver.new, DIDKit::Resolver)
@dids = T.let(Concurrent::Set.new, Concurrent::Set)
@dids.merge(Bluesky::MonitoredDid.pluck(:did))
logger.info(
"loaded #{@dids.size} #{"did".pluralize(@dids.size)} from database",
@hashtags = T.let(Concurrent::Set.new, Concurrent::Set)
@dids.merge(
Domain::Bluesky::MonitoredObject.where(kind: :user_did).pluck(:value),
)
logger.info("dids: #{@dids.to_a.join(", ")}")
@hashtags.merge(
Domain::Bluesky::MonitoredObject
.where(kind: :hashtag)
.pluck(:value)
.map(&:downcase),
)
logger.info(
format_tags("loaded dids", make_tags(num_dids: @dids.to_a.size)),
)
logger.info(
format_tags(
"loaded hashtags",
make_tags(
num_hashtags: @hashtags.to_a.size,
hashtags: @hashtags.to_a.inspect,
),
),
)
cursor = load_cursor
logger.info(format_tags("using cursor", make_tags(cursor:)))
@bluesky_client =
T.let(
Skyfall::Jetstream.new(
"jetstream2.us-east.bsky.network",
{
cursor: nil,
# cursor: load_cursor,
cursor:,
wanted_collections: %w[
app.bsky.feed.post
app.bsky.embed.images
app.bsky.embed.video
app.bsky.embed.recordWithMedia
],
},
@@ -39,23 +59,29 @@ module Tasks::Bluesky
sig { void }
def run
@bluesky_client.on_connecting { logger.info("connecting...") }
@bluesky_client.on_connect { logger.info("connected") }
@bluesky_client.on_disconnect { logger.info("disconnected") }
@bluesky_client.on_connecting do
logger.info(format_tags("connecting..."))
end
@bluesky_client.on_connect { logger.info(format_tags("connected")) }
@bluesky_client.on_disconnect { logger.info(format_tags("disconnected")) }
@bluesky_client.on_reconnect do
logger.info("connection lost, trying to reconnect...")
logger.info(format_tags("connection lost, trying to reconnect..."))
end
@bluesky_client.on_timeout do
logger.info("connection stalled, triggering a reconnect...")
logger.info(
format_tags("connection stalled, triggering a reconnect..."),
)
end
@bluesky_client.on_message do |msg|
handle_message(msg)
if msg.seq % 10_000 == 0
logger.info("saving cursor: #{msg.seq.to_s.bold}")
logger.info(format_tags("saving cursor", make_tags(seq: msg.seq)))
save_cursor(msg.seq)
end
end
@bluesky_client.on_error { |e| logger.error("ERROR: #{e.to_s.red.bold}") }
@bluesky_client.on_error do |e|
logger.error(format_tags("ERROR", make_tags(error: e)))
end
# Start the thread to listen to postgres NOTIFYs to add to the @dids set
pg_notify_thread =
@@ -63,12 +89,12 @@ module Tasks::Bluesky
@bluesky_client.connect
rescue Interrupt
logger.info("shutting down...")
logger.info(format_tags("shutting down..."))
@bluesky_client.disconnect
@bluesky_client.close
pg_notify_thread&.raise(Interrupt)
pg_notify_thread&.join
logger.info("shutdown complete")
logger.info(format_tags("shutdown complete"))
end
sig { params(msg: Skyfall::Jetstream::Message).void }
@@ -82,30 +108,85 @@ module Tasks::Bluesky
sig { params(msg: Skyfall::Jetstream::CommitMessage).void }
def handle_commit_message(msg)
return unless msg.type == :commit
return unless @dids.include?(msg.did)
msg.operations.each do |op|
next unless op.action == :create && op.type == :bsky_post
# Check if we should process this post (either from monitored DID or contains monitored hashtags)
from_monitored_did = @dids.include?(msg.did)
post_text = T.let(op.raw_record["text"], T.nilable(String)) || ""
post_hashtags = extract_hashtags(post_text)
has_monitored_hashtag = !(post_hashtags & @hashtags.to_a).empty?
log_args =
(
if from_monitored_did
{ reason: "did", did: msg.did }
else
{
reason: "hashtag",
hashtags: (post_hashtags & @hashtags.to_a).inspect,
}
end
)
next unless from_monitored_did || has_monitored_hashtag
deferred_job_sink = DeferredJobSink.new(self.class)
helper = Bluesky::ProcessPostHelper.new(deferred_job_sink)
embed_data =
T.let(op.raw_record["embed"], T.nilable(T::Hash[String, T.untyped]))
next unless embed_data
unless helper.should_process_post?(embed_data)
logger.info(
format_tags(
"skipping post",
make_tags(**log_args),
make_tags(uri: op.uri),
make_tags(type: embed_data["$type"]),
),
)
next
end
logger.info(
format_tags(
"processing post",
make_tags(**log_args),
make_tags(uri: op.uri),
),
)
post =
Domain::Post::BlueskyPost.find_or_create_by!(at_uri: op.uri) do |post|
post.bluesky_rkey = op.rkey
post.text = op.raw_record["text"]
post.bluesky_created_at = msg.time.in_time_zone("UTC")
post.rkey = op.rkey
post.text = post_text
post.posted_at = msg.time.in_time_zone("UTC")
post.creator = creator_for(msg)
post.post_raw = op.raw_record
post.monitor_scanned_at = Time.current
end
process_media(post, embed_data, msg.did)
helper.process_post_media(post, embed_data, msg.did)
logger.info(
"created bluesky post: `#{post.bluesky_rkey}` / `#{post.at_uri}`",
format_tags(
"created bluesky post",
make_tags(rkey: post.rkey),
make_tags(at_uri: post.at_uri),
),
)
ensure
deferred_job_sink.enqueue_deferred_jobs! if deferred_job_sink
end
end
sig { params(text: String).returns(T::Array[String]) }
def extract_hashtags(text)
# Extract hashtags from text (matches #word or #word123 but not #123)
hashtags = text.scan(/#([a-zA-Z]\w*)/).flatten
# Convert to lowercase for case-insensitive matching
hashtags.map(&:downcase)
end
sig do
params(msg: Skyfall::Jetstream::CommitMessage).returns(
T.nilable(Domain::User::BlueskyUser),
@@ -113,12 +194,20 @@ module Tasks::Bluesky
end
def creator_for(msg)
did = msg.did
Domain::User::BlueskyUser.find_or_create_by!(did:) do |creator|
creator = Domain::User::BlueskyUser.find_or_initialize_by(did:)
if creator.new_record?
creator.handle = @resolver.get_validated_handle(did) || did
logger.info(
"created bluesky user: `#{creator.handle}` / `#{creator.did}`",
format_tags(
"created bluesky user",
make_tags(handle: creator.handle),
make_tags(did: creator.did),
),
)
creator.save!
Domain::Bluesky::Job::ScanUserJob.perform_later(user: creator)
end
creator
end
sig { returns(T.nilable(Integer)) }
@@ -133,134 +222,50 @@ module Tasks::Bluesky
sig { void }
def listen_to_postgres_notifies
logger.info("listening to postgres NOTIFYs")
logger.info(format_tags("listening to postgres NOTIFYs"))
ActiveRecord::Base.connection_pool.with_connection do |conn|
conn = T.cast(conn, ActiveRecord::ConnectionAdapters::PostgreSQLAdapter)
conn.exec_query("LISTEN #{Bluesky::MonitoredDid::ADDED_NOTIFY_CHANNEL}")
conn.exec_query(
"LISTEN #{Bluesky::MonitoredDid::REMOVED_NOTIFY_CHANNEL}",
"LISTEN #{Domain::Bluesky::MonitoredObject::ADDED_NOTIFY_CHANNEL}",
)
conn.raw_connection.wait_for_notify do |event, pid, payload|
logger.info("NOTIFY: #{event} / pid: #{pid} / payload: #{payload}")
case event
when Bluesky::MonitoredDid::ADDED_NOTIFY_CHANNEL
@dids.add(payload)
when Bluesky::MonitoredDid::REMOVED_NOTIFY_CHANNEL
@dids.delete(payload)
conn.exec_query(
"LISTEN #{Domain::Bluesky::MonitoredObject::REMOVED_NOTIFY_CHANNEL}",
)
loop do
conn.raw_connection.wait_for_notify do |event, pid, payload|
kind, value = payload.split("/", 2)
logger.info(
format_tags(
"NOTIFY",
make_tags(event: event),
make_tags(pid: pid),
make_tags(payload: payload),
make_tags(kind: kind),
make_tags(value: value),
),
)
next unless kind && value
case event
when Domain::Bluesky::MonitoredObject::ADDED_NOTIFY_CHANNEL
@dids.add(value) if kind == "user_did"
@hashtags.add(value.downcase) if kind == "hashtag"
when Domain::Bluesky::MonitoredObject::REMOVED_NOTIFY_CHANNEL
@dids.delete(value) if kind == "user_did"
@hashtags.delete(value.downcase) if kind == "hashtag"
end
end
end
rescue Interrupt
logger.info("interrupt in notify thread...")
logger.info(format_tags("interrupt in notify thread"))
ensure
logger.info("unlistening to postgres NOTIFYs")
logger.info(format_tags("unlistening to postgres NOTIFYs"))
conn.exec_query(
"UNLISTEN #{Bluesky::MonitoredDid::ADDED_NOTIFY_CHANNEL}",
"UNLISTEN #{Domain::Bluesky::MonitoredObject::ADDED_NOTIFY_CHANNEL}",
)
conn.exec_query(
"UNLISTEN #{Bluesky::MonitoredDid::REMOVED_NOTIFY_CHANNEL}",
"UNLISTEN #{Domain::Bluesky::MonitoredObject::REMOVED_NOTIFY_CHANNEL}",
)
end
end
private
sig do
params(
post: Domain::Post::BlueskyPost,
embed_data: T::Hash[String, T.untyped],
did: String,
).void
end
def process_media(post, embed_data, did)
case embed_data["$type"]
when "app.bsky.embed.images"
process_images(post, embed_data["images"], did)
when "app.bsky.embed.recordWithMedia"
# Handle quote posts with media
if embed_data["media"] &&
embed_data["media"]["$type"] == "app.bsky.embed.images"
process_images(post, embed_data["media"]["images"], did)
end
when "app.bsky.embed.external"
# Handle external embeds (website cards) - could have thumbnail images
process_external_embed(post, embed_data["external"], did)
else
logger.debug("unknown embed type: #{embed_data["$type"]}")
end
end
sig do
params(
post: Domain::Post::BlueskyPost,
images: T::Array[T::Hash[String, T.untyped]],
did: String,
).void
end
def process_images(post, images, did)
files = []
images.each_with_index do |image_data, index|
blob_data = image_data["image"]
next unless blob_data && blob_data["ref"]
# Create PostFile record for the image
post_file =
post.files.build(
type: "Domain::PostFile::BlueskyPostFile",
file_order: index,
url_str: construct_blob_url(did, blob_data["ref"]["$link"]),
state: "pending",
alt_text: image_data["alt"],
blob_ref: blob_data["ref"]["$link"],
)
# Store aspect ratio if present
if image_data["aspectRatio"]
post_file.aspect_ratio_width = image_data["aspectRatio"]["width"]
post_file.aspect_ratio_height = image_data["aspectRatio"]["height"]
end
post_file.save!
Domain::StaticFileJob.perform_later({ post_file: })
files << post_file
end
logger.info(
"created #{files.size} #{"file".pluralize(files.size)} for post: #{post.bluesky_rkey} / #{did}",
)
end
sig do
params(
post: Domain::Post::BlueskyPost,
external_data: T::Hash[String, T.untyped],
did: String,
).void
end
def process_external_embed(post, external_data, did)
# Handle thumbnail image from external embeds (website cards)
thumb_data = external_data["thumb"]
return unless thumb_data && thumb_data["ref"]
post_file =
post.files.build(
type: "Domain::PostFile::BlueskyPostFile",
file_order: 0,
url_str: construct_blob_url(did, thumb_data["ref"]["$link"]),
state: "pending",
)
# Store metadata
post_file.alt_text = "Website preview thumbnail"
post_file.blob_ref = thumb_data["ref"]["$link"]
post_file.save!
logger.info("created bluesky external thumbnail: #{post_file.url_str}")
end
sig { params(did: String, cid: String).returns(String) }
def construct_blob_url(did, cid)
# Construct the Bluesky blob URL using the AT Protocol getBlob endpoint
"https://bsky.social/xrpc/com.atproto.sync.getBlob?did=#{did}&cid=#{cid}"
end
end
end
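
A standalone sketch of the hashtag gate used in handle_commit_message above, assuming a hard-coded monitored set in place of the Domain::Bluesky::MonitoredObject-backed @hashtags; the regex and set intersection mirror extract_hashtags and the has_monitored_hashtag check, and the sample texts are invented.

require "set"

# Hypothetical monitored set; in the task this is loaded from the database.
MONITORED_HASHTAGS = Set.new(%w[furryart commission])

# Matches #word or #word123 but not #123, lowercased for case-insensitive matching.
def extract_hashtags(text)
  text.scan(/#([a-zA-Z]\w*)/).flatten.map(&:downcase)
end

def monitored_hashtag?(text)
  !(extract_hashtags(text) & MONITORED_HASHTAGS.to_a).empty?
end

monitored_hashtag?("New piece! #FurryArt #2024") # => true ("furryart" matches)
monitored_hashtag?("no tags, just #123")         # => false (digits-only tag is ignored)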

View File

@@ -48,7 +48,7 @@ module Tasks
def run_post_file_descending(start_at)
last_post_file_id = get_progress(start_at)&.to_i
query = Domain::PostFile.where(state: "ok").includes(:blob)
query = Domain::PostFile.where(state: "ok").includes(:blob, :thumbnails)
query = query.where(id: ..last_post_file_id) if last_post_file_id
log("counting post files to process...")

View File

@@ -0,0 +1,5 @@
# typed: strict
# frozen_string_literal: true
module Tasks::Inkbunny
end

View File

@@ -0,0 +1,95 @@
# typed: strict
# frozen_string_literal: true
class Tasks::Inkbunny::EnqueueMissingPostsTask < EnqueueJobBase
extend T::Sig
include HasColorLogger
include Domain::Fa::HasCountFailedInQueue
sig { override.returns(String) }
def progress_key
"task-inkbunny-enqueue-missing-posts"
end
sig do
override
.params(
perform_max: T.nilable(Integer),
start_at: T.nilable(T.any(Integer, String)),
log_sink: T.any(IO, StringIO),
)
.void
end
def initialize(perform_max: nil, start_at: nil, log_sink: $stderr)
super(perform_max:, log_sink:)
@start_at =
T.let(
get_progress(start_at&.to_s)&.to_i ||
T.cast(Domain::Post::InkbunnyPost.maximum(:ib_id), Integer),
Integer,
)
end
sig { override.void }
def start_enqueuing
log("starting from ib_id: #{@start_at}") if @start_at
total_processed = 0
max_ib_post_id = @start_at
loop do
min_ib_post_id = [max_ib_post_id - 10_000, 0].max
missing_ib_post_ids_sql = <<~SQL
SELECT series.id
FROM generate_series(#{min_ib_post_id}, #{max_ib_post_id}) AS series(id)
LEFT JOIN domain_posts_ib_aux AS posts
ON series.id = posts.ib_id
WHERE posts.ib_id IS NULL
ORDER BY series.id DESC
LIMIT 100
SQL
missing_ib_post_ids =
ActiveRecord::Base
.connection
.execute(missing_ib_post_ids_sql)
.values
.flatten
.map(&:to_i)
missing_ib_post_ids = T.cast(missing_ib_post_ids, T::Array[Integer])
if found_min_id = missing_ib_post_ids.min
enqueue(always_recheck: true) do
Domain::Inkbunny::Job::UpdatePostsJob.perform_now(
ib_post_ids: missing_ib_post_ids,
)
end
# Continue from the lowest ID we just processed
max_ib_post_id = found_min_id
total_processed += missing_ib_post_ids.size
logger.info(
format_tags(
make_tags(total_processed:, this_batch: missing_ib_post_ids.size),
),
)
else
# No missing IDs found in this large range, move the window down
max_ib_post_id = min_ib_post_id
end
# Stop if we've reached the beginning
max_ib_post_id = [max_ib_post_id, 0].max
save_progress(max_ib_post_id.to_s)
logger.info("saved progress: #{max_ib_post_id}")
break if max_ib_post_id <= 0
break if interrupted?
end
end
sig { override.returns(Integer) }
def queue_size
count_failed_in_queue(%w[inkbunny static_file])
end
end
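
A toy model of the window walk in start_enqueuing, assuming an in-memory set of known ib_ids instead of the generate_series/LEFT JOIN query; the 10,000-ID window, the 100-ID batch cap, and the resume-from-lowest-gap behaviour follow the task above, with merge standing in for UpdatePostsJob filling the gaps.

require "set"

# Hypothetical known IDs with a few gaps punched in for illustration.
known_ids = (1..50_000).to_set - [123, 9_999, 42_000].to_set

max_id = known_ids.max
until max_id <= 0
  min_id = [max_id - 10_000, 0].max
  # Highest (up to) 100 missing IDs in the window, like ORDER BY id DESC LIMIT 100.
  missing = (min_id..max_id).reject { |id| known_ids.include?(id) }.last(100)
  if (found_min = missing.min)
    puts "would enqueue UpdatePostsJob for #{missing.size} missing ids"
    known_ids.merge(missing) # stand-in for the job creating these posts
    max_id = found_min       # continue below the lowest gap just handled
  else
    max_id = min_id          # no gaps here; slide the window down
  end
end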

View File

@@ -47,10 +47,22 @@ module Tasks
end
rescue Telegram::Bot::Exceptions::ResponseError => e
log("❌ Telegram API error: #{e.message}")
log("Please check your bot token configuration.")
if e.error_code == 429
retry_after = e.response.dig("parameters", "retry_after")&.to_i || 10
log(
"❌ Rate limit exceeded, waiting #{retry_after} seconds and trying again.",
)
sleep(retry_after)
retry
else
raise e
end
rescue Interrupt
log("User interrupted the bot")
rescue StandardError => e
log("❌ Unexpected error: #{e.message}")
log("Backtrace: #{e.backtrace&.first(5)&.join("\n")}")
raise e
ensure
log("🛑 Telegram bot stopped")
end
@@ -75,7 +87,18 @@ module Tasks
).void
end
def handle_message(bot, message)
return unless message.photo || message.document
unless message.photo || message.document || message.video
bot.api.send_message(
chat_id: message.chat.id,
text: [
"👋 Hi! I'm a visual search bot.",
"📷 Send me an <b>image</b> or <b>video</b> and I'll search for similar content in my database.",
].join("\n\n"),
parse_mode: "HTML",
reply_to_message_id: message.message_id,
)
return
end
# Start timing the total request
total_request_timer = Stopwatch.start
@@ -89,14 +112,14 @@ module Tasks
response_message =
bot.api.send_message(
chat_id: chat_id,
text: "🔍 Analyzing image... Please wait...",
text: "🔍 Analyzing... Please wait...",
reply_to_message_id: message.message_id,
)
begin
# Process the media and perform visual search
search_result, processed_blob =
process_image_message_with_logging(bot, message, telegram_log)
process_media_message_with_logging(bot, message, telegram_log)
if search_result
if search_result.empty?
@@ -114,7 +137,7 @@ module Tasks
)
else
result_text =
"❌ Could not process the image. Please make sure it's a valid image file."
"❌ Could not process the file. Please make sure it's a valid image or video file."
# Update log with invalid image
update_telegram_log_invalid_image(telegram_log, result_text)
@@ -125,7 +148,7 @@ module Tasks
chat_id: chat_id,
message_id: response_message.message_id,
text: result_text,
parse_mode: "Markdown",
parse_mode: "HTML",
)
# Record total request time
@@ -133,7 +156,7 @@ module Tasks
telegram_log.update!(total_request_time: total_request_time)
log("⏱️ Total request completed in #{total_request_timer.elapsed_s}")
rescue StandardError => e
log("Error processing image: #{e.message}")
log("Error processing file: #{e.message}")
# Update log with error
update_telegram_log_error(telegram_log, e)
@@ -143,7 +166,7 @@ module Tasks
chat_id: chat_id,
message_id: response_message.message_id,
text:
"❌ An error occurred while processing your image. Please try again.",
"❌ An error occurred while processing your file. Please try again.",
)
# Record total request time even for errors
@@ -169,21 +192,21 @@ module Tasks
],
)
end
def process_image_message_with_logging(bot, message, telegram_log)
log("📥 Received image message from chat #{message.chat.id}")
def process_media_message_with_logging(bot, message, telegram_log)
log("📥 Received message from chat #{message.chat.id}")
# Get the largest photo variant, video, or document from the message
image_file = get_image_file_from_message(message)
return nil, nil unless image_file
media_file = get_media_file_from_message(message)
return nil, nil unless media_file
# Download the image to a temporary file
download_stopwatch = Stopwatch.start
temp_file = download_telegram_image(bot, image_file)
temp_file = download_telegram_file(bot, media_file)
download_time = download_stopwatch.elapsed
return nil, nil unless temp_file
log("📥 Downloaded image in #{download_stopwatch.elapsed_s}")
log("📥 Downloaded file in #{download_stopwatch.elapsed_s}")
processed_blob = nil
@@ -196,11 +219,13 @@ module Tasks
# Create BlobFile for the processed image
content_type =
case image_file
case media_file
when Telegram::Bot::Types::Document
image_file.mime_type || "application/octet-stream"
media_file.mime_type || "application/octet-stream"
when Telegram::Bot::Types::PhotoSize
"image/jpeg" # Telegram photos are typically JPEG
when Telegram::Bot::Types::Video
media_file.mime_type || "video/mp4"
else
"application/octet-stream"
end
@@ -213,16 +238,19 @@ module Tasks
image_processing_time = image_processing_stopwatch.elapsed
log("🔧 Processed image in #{image_processing_stopwatch.elapsed_s}")
log("🔧 Processed file in #{image_processing_stopwatch.elapsed_s}")
# Time fingerprint generation
fingerprint_stopwatch = Stopwatch.start
fingerprint_value =
Domain::PostFile::BitFingerprint.from_file_path(file_path)
detail_fingerprint_value =
Domain::PostFile::BitFingerprint.detail_from_file_path(file_path)
temp_dir = Dir.mktmpdir("telegram-bot-task-visual-search")
fingerprints = generate_fingerprints(file_path, content_type, temp_dir)
fingerprint_computation_time = fingerprint_stopwatch.elapsed
if fingerprints.nil?
log("❌ Error generating fingerprints")
return nil, nil
end
log(
"🔍 Generated fingerprints in #{fingerprint_stopwatch.elapsed_s}, searching for similar images...",
)
@@ -231,8 +259,7 @@ module Tasks
search_stopwatch = Stopwatch.start
similar_results =
find_similar_fingerprints(
fingerprint_value: fingerprint_value,
fingerprint_detail_value: detail_fingerprint_value,
fingerprints.map(&:to_fingerprint_and_detail),
limit: 10,
oversearch: 3,
includes: {
@@ -259,10 +286,11 @@ module Tasks
[high_quality_matches, processed_blob]
rescue StandardError => e
log("❌ Error processing image: #{e.message}")
log("❌ Error processing file: #{e.message}")
[nil, processed_blob]
ensure
# Clean up temp file
# Clean up temp files
FileUtils.rm_rf(temp_dir) if temp_dir
temp_file.unlink if temp_file
end
end
@@ -331,10 +359,10 @@ module Tasks
telegram_log.update!(
status: :invalid_image,
search_results_count: 0,
error_message: "Invalid or unsupported image format",
error_message: "Invalid or unsupported file format",
response_data: {
response_text: response_text,
error: "Invalid image format",
error: "Invalid file format",
},
)
end
@@ -381,9 +409,10 @@ module Tasks
external_url = post.external_url_for_view
if (title = post.title) && external_url
response += " - [#{post.to_param} - #{title}](#{external_url})"
response +=
" - <a href=\"#{external_url}\">#{post.to_param} - #{html_escape(title)}</a>"
elsif external_url
response += " - [#{post.to_param}](#{external_url})"
response += " - <a href=\"#{external_url}\">#{post.to_param}</a>"
else
response += " - #{post.to_param}"
end
@@ -391,9 +420,10 @@ module Tasks
if post.respond_to?(:creator) && (creator = post.send(:creator))
url = creator.external_url_for_view
if url
response += " by [#{creator.name_for_view}](#{url})"
response +=
" by <a href=\"#{url}\">#{html_escape(creator.name_for_view)}</a>"
else
response += " by #{creator.name_for_view}"
response += " by #{html_escape(creator.name_for_view)}"
end
end
response += "\n"
@@ -402,32 +432,46 @@ module Tasks
response
end
sig { params(text: String).returns(String) }
def html_escape(text)
# Only escape characters that are explicitly required by the Telegram Bot API
# All <, > and & symbols that are not part of a tag or HTML entity must be replaced
# API supports only these named HTML entities: &lt;, &gt;, &amp; and &quot;
text
.gsub("&", "&amp;") # Ampersand (must be first to avoid double-escaping)
.gsub("<", "&lt;") # Less than
.gsub(">", "&gt;") # Greater than
end
# Extract image file information from Telegram message
sig do
params(message: Telegram::Bot::Types::Message).returns(
T.nilable(
T.any(
Telegram::Bot::Types::PhotoSize,
Telegram::Bot::Types::Video,
Telegram::Bot::Types::Document,
),
),
)
end
def get_image_file_from_message(message)
def get_media_file_from_message(message)
if message.photo && message.photo.any?
# Get the largest photo variant
message.photo.max_by { |photo| photo.file_size || 0 }
elsif message.video
message.video
elsif message.document
# Check if document is an image
content_type = message.document.mime_type
if content_type&.start_with?("image/")
if content_type&.start_with?("image/") || content_type == "video/mp4"
message.document
else
log("❌ Document is not an image: #{content_type}")
log("❌ Document is not an image or video: #{content_type}")
nil
end
else
log("❌ No image found in message")
log("❌ No image or video found in message")
nil
end
end
@@ -439,11 +483,12 @@ module Tasks
file_info:
T.any(
Telegram::Bot::Types::PhotoSize,
Telegram::Bot::Types::Video,
Telegram::Bot::Types::Document,
),
).returns(T.nilable(Tempfile))
end
def download_telegram_image(bot, file_info)
def download_telegram_file(bot, file_info)
bot_token = get_bot_token
return nil unless bot_token
@@ -459,7 +504,7 @@ module Tasks
# Download the file
file_url = "https://api.telegram.org/file/bot#{bot_token}/#{file_path}"
log("📥 Downloading image from: #{file_url}...")
log("📥 Downloading file from: #{file_url}...")
uri = URI(file_url)
downloaded_data = Net::HTTP.get(uri)
@@ -468,15 +513,15 @@ module Tasks
extension = File.extname(file_path)
extension = ".jpg" if extension.empty?
temp_file = Tempfile.new(["telegram_image", extension])
temp_file = Tempfile.new(["telegram_file", extension])
temp_file.binmode
temp_file.write(downloaded_data)
temp_file.close
log("✅ Downloaded image to: #{temp_file.path}")
log("✅ Downloaded file to: #{temp_file.path}")
temp_file
rescue StandardError => e
log("❌ Error downloading image: #{e.message}")
log("❌ Error downloading file: #{e.message}")
nil
end
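
A minimal sketch of the HTML formatting used with parse_mode: "HTML" above; html_escape is copied from the task, while the title and URL are invented for illustration.

def html_escape(text)
  # Ampersand first to avoid double-escaping, then the angle brackets.
  text.gsub("&", "&amp;").gsub("<", "&lt;").gsub(">", "&gt;")
end

title = "Fox & Wolf <WIP>"                 # invented post title
url   = "https://example.com/posts/123"    # invented URL
" - <a href=\"#{url}\">#{html_escape(title)}</a>"
# => " - <a href=\"https://example.com/posts/123\">Fox &amp; Wolf &lt;WIP&gt;</a>"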
end

View File

@@ -1,25 +0,0 @@
# typed: strict
class Bluesky::MonitoredDid < ReduxApplicationRecord
self.table_name = "bluesky_monitored_dids"
validates :did, presence: true, uniqueness: true
ADDED_NOTIFY_CHANNEL = "bluesky_did_added"
REMOVED_NOTIFY_CHANNEL = "bluesky_did_removed"
after_create_commit :notify_monitor_added
after_destroy_commit :notify_monitor_removed
sig { void }
def notify_monitor_added
self.class.connection.execute(
"NOTIFY #{ADDED_NOTIFY_CHANNEL}, '#{self.did}'",
)
end
sig { void }
def notify_monitor_removed
self.class.connection.execute(
"NOTIFY #{REMOVED_NOTIFY_CHANNEL}, '#{self.did}'",
)
end
end

View File

@@ -0,0 +1,41 @@
# typed: strict
class Domain::Bluesky::MonitoredObject < ReduxApplicationRecord
self.table_name = "domain_bluesky_monitored_objects"
validates :value, presence: true, uniqueness: true
enum :kind, { user_did: 0, hashtag: 1 }
ADDED_NOTIFY_CHANNEL = "bluesky_object_added"
REMOVED_NOTIFY_CHANNEL = "bluesky_object_removed"
after_create_commit :notify_monitor_added
after_destroy_commit :notify_monitor_removed
sig do
params(user: Domain::User::BlueskyUser).returns(
Domain::Bluesky::MonitoredObject,
)
end
def self.build_for_user(user)
build(value: user.did!, kind: :user_did)
end
sig { params(hashtag: String).returns(Domain::Bluesky::MonitoredObject) }
def self.build_for_hashtag(hashtag)
build(value: hashtag, kind: :hashtag)
end
sig { void }
def notify_monitor_added
self.class.connection.execute(
"NOTIFY #{ADDED_NOTIFY_CHANNEL}, '#{self.kind}/#{self.value}'",
)
end
sig { void }
def notify_monitor_removed
self.class.connection.execute(
"NOTIFY #{REMOVED_NOTIFY_CHANNEL}, '#{self.kind}/#{self.value}'",
)
end
end
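
A sketch of the NOTIFY payload contract between this model and the jetstream task: the callbacks emit "kind/value" and the listener splits on the first "/" only, so values containing further slashes survive intact. The payloads below are made up.

payload = "user_did/did:plc:abc123"   # hypothetical DID
kind, value = payload.split("/", 2)
kind   # => "user_did"
value  # => "did:plc:abc123"

payload = "hashtag/FurryArt"
kind, value = payload.split("/", 2)
value.downcase                        # => "furryart", matching the lowercased set in the task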

View File

@@ -102,11 +102,22 @@ class Domain::Post < ReduxApplicationRecord
inverse_of: :post,
dependent: :destroy
sig { params(klass: T.class_of(Domain::User)).void }
def self.has_faving_users!(klass)
self.class_has_faving_users = klass
sig do
params(
user_klass: T.class_of(Domain::User),
fav_klass: T.class_of(Domain::UserPostFav),
).void
end
def self.has_faving_users!(user_klass, fav_klass = Domain::UserPostFav)
self.class_has_faving_users = user_klass
has_many :user_post_favs,
class_name: fav_klass.name,
inverse_of: :post,
dependent: :destroy
has_many :faving_users,
class_name: klass.name,
class_name: user_klass.name,
through: :user_post_favs,
source: :user
end
@@ -132,7 +143,7 @@ class Domain::Post < ReduxApplicationRecord
sig { overridable.returns(Symbol) }
def self.post_order_attribute
:id
:posted_at
end
sig { overridable.returns(T::Boolean) }
@@ -144,6 +155,10 @@ class Domain::Post < ReduxApplicationRecord
def title
end
sig { abstract.returns(Integer) }
def num_post_files_for_view
end
sig { overridable.returns(String) }
def title_for_view
title || "(unknown)"

View File

@@ -4,6 +4,32 @@ class Domain::Post::BlueskyPost < Domain::Post
belongs_to :first_seen_entry, class_name: "::HttpLogEntry", optional: true
belongs_to :quoted_post,
class_name: "Domain::Post::BlueskyPost",
optional: true,
foreign_key: :quote_uri,
primary_key: :at_uri,
inverse_of: :quoting_posts
has_many :quoting_posts,
class_name: "Domain::Post::BlueskyPost",
foreign_key: :quote_uri,
primary_key: :at_uri,
inverse_of: :quoted_post
belongs_to :replying_to_post,
class_name: "Domain::Post::BlueskyPost",
optional: true,
foreign_key: :reply_to_uri,
primary_key: :at_uri,
inverse_of: :replying_posts
has_many :replying_posts,
class_name: "Domain::Post::BlueskyPost",
foreign_key: :reply_to_uri,
primary_key: :at_uri,
inverse_of: :replying_to_post
has_multiple_files! Domain::PostFile::BlueskyPostFile
has_single_creator! Domain::User::BlueskyUser
has_faving_users! Domain::User::BlueskyUser
@@ -14,12 +40,26 @@ class Domain::Post::BlueskyPost < Domain::Post
validates :state, presence: true
validates :at_uri, presence: true, uniqueness: true
validates :bluesky_rkey, presence: true
validates :bluesky_created_at, presence: true
validates :rkey, presence: true
before_validation { self.rkey = at_uri&.split("/")&.last }
validate :rkey_matches_at_uri
sig { void }
def rkey_matches_at_uri
return true if rkey.blank? && at_uri.blank?
return true if rkey == at_uri&.split("/")&.last
errors.add(:rkey, "does not match AT URI")
end
sig { override.returns([String, Symbol]) }
def self.param_prefix_and_attribute
["bsky", :bluesky_rkey]
["bsky", :rkey]
end
sig { override.returns(Integer) }
def num_post_files_for_view
self.files.group(:file_order).count.size
end
sig { override.returns(String) }
@@ -29,7 +69,7 @@ class Domain::Post::BlueskyPost < Domain::Post
sig { override.returns(Symbol) }
def self.post_order_attribute
:bluesky_created_at
:posted_at
end
sig { override.returns(Domain::DomainType) }
@@ -44,16 +84,19 @@ class Domain::Post::BlueskyPost < Domain::Post
sig { override.returns(T.nilable(T.any(String, Integer))) }
def domain_id_for_view
self.bluesky_rkey
self.rkey
end
sig { override.returns(T.nilable(String)) }
def primary_creator_name_fallback_for_view
self.creator&.handle
end
sig { override.returns(T.nilable(Addressable::URI)) }
def external_url_for_view
handle = self.creator&.handle
if bluesky_rkey.present? && handle.present?
Addressable::URI.parse(
"https://bsky.app/profile/#{handle}/post/#{bluesky_rkey}",
)
if rkey.present? && handle.present?
Addressable::URI.parse("https://bsky.app/profile/#{handle}/post/#{rkey}")
end
end
@@ -62,9 +105,14 @@ class Domain::Post::BlueskyPost < Domain::Post
self.files.first
end
sig { override.returns(T.nilable(HttpLogEntry)) }
def scanned_post_log_entry_for_view
self.first_seen_entry
end
sig { override.returns(T::Boolean) }
def pending_scan?
scanned_at.nil?
false
end
sig { override.returns(T.nilable(String)) }
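
A small sketch of the rkey handling above, assuming an invented AT URI and handle: the record key is the last path segment of at_uri (as set by the before_validation callback), and it feeds the public bsky.app URL built in external_url_for_view.

at_uri = "at://did:plc:abc123/app.bsky.feed.post/3kxyzexample" # invented URI
rkey   = at_uri.split("/").last                                # => "3kxyzexample"
handle = "example.bsky.social"                                 # invented handle
"https://bsky.app/profile/#{handle}/post/#{rkey}"
# => "https://bsky.app/profile/example.bsky.social/post/3kxyzexample"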

View File

@@ -3,7 +3,7 @@ class Domain::Post::E621Post < Domain::Post
aux_table :e621
has_single_file!
has_faving_users! Domain::User::E621User
has_faving_users! Domain::User::E621User, Domain::UserPostFav::E621UserPostFav
belongs_to_groups! :pools,
Domain::PostGroup::E621Pool,
Domain::PostGroupJoin::E621PoolJoin
@@ -47,6 +47,11 @@ class Domain::Post::E621Post < Domain::Post
self.index_page_ids << log_entry.id
end
sig { override.returns(Integer) }
def num_post_files_for_view
1
end
sig { override.returns([String, Symbol]) }
def self.param_prefix_and_attribute
["e621", :e621_id]

View File

@@ -10,7 +10,7 @@ class Domain::Post::FaPost < Domain::Post
has_single_file!
has_single_creator! Domain::User::FaUser
has_faving_users! Domain::User::FaUser
has_faving_users! Domain::User::FaUser, Domain::UserPostFav::FaUserPostFav
after_initialize { self.state ||= "ok" if new_record? }
@@ -50,6 +50,11 @@ class Domain::Post::FaPost < Domain::Post
self.creator
end
sig { override.returns(Integer) }
def num_post_files_for_view
1
end
sig { override.returns(T.nilable(T.any(String, Integer))) }
def domain_id_for_view
self.fa_id

View File

@@ -135,4 +135,9 @@ class Domain::Post::InkbunnyPost < Domain::Post
def num_favorites_for_view
num_favs
end
sig { override.returns(Integer) }
def num_post_files_for_view
self.files.group(:file_order).count.size
end
end

View File

@@ -112,4 +112,10 @@ class Domain::Post::SofurryPost < Domain::Post
def scanned_post_log_entry_for_view
last_scan_log_entry
end
sig { override.returns(Integer) }
def num_post_files_for_view
# TODO - this might not be correct, but sofurry is still down
1
end
end

View File

@@ -118,7 +118,8 @@ class Domain::PostFile::Thumbnail < ReduxApplicationRecord
logger.info(format_tags(make_tag("num_frames", num_frames)))
return [] if num_frames.zero?
existing_thumb_types = post_file.thumbnails.to_a.map(&:thumb_type).uniq
existing_thumb_types =
T.cast(post_file.thumbnails.map(&:thumb_type).uniq, T::Array[String])
logger.info(
format_tags(make_tag("existing_thumb_types", existing_thumb_types)),
)
@@ -128,8 +129,10 @@ class Domain::PostFile::Thumbnail < ReduxApplicationRecord
THUMB_TYPE_TO_OPTIONS.each do |thumb_type, options|
thumb_type = thumb_type.to_s
logger.tagged(make_tag("thumb_type", thumb_type)) do
next if existing_thumb_types.include?(thumb_type)
logger.info(format_tags("creating thumbnail"))
if existing_thumb_types.include?(thumb_type)
logger.info(format_tags("thumbnail type already exists"))
next
end
# get the number of frames in the post file
frames_to_thumbnail =
@@ -141,6 +144,8 @@ class Domain::PostFile::Thumbnail < ReduxApplicationRecord
frames_to_thumbnail.each do |frame|
logger.tagged(make_tag("frame", frame)) do
logger.info(format_tags("creating thumbnail"))
thumbnail = post_file.thumbnails.build(thumb_type:, frame:)
unless thumb_file_path = thumbnail.absolute_file_path
logger.info(format_tags("unable to compute thumbnail path"))

View File

@@ -3,6 +3,7 @@ class Domain::PostGroup < ReduxApplicationRecord
extend T::Helpers
include AttrJsonRecordAliases
include HasCompositeToParam
include HasDomainType
self.table_name = "domain_post_groups"
abstract!

View File

@@ -7,4 +7,9 @@ class Domain::PostGroup::E621Pool < Domain::PostGroup
def self.param_prefix_and_attribute
["e621", :e621_id]
end
sig { override.returns(Domain::DomainType) }
def self.domain_type
Domain::DomainType::E621
end
end

View File

@@ -25,4 +25,9 @@ class Domain::PostGroup::InkbunnyPool < Domain::PostGroup
"https://inkbunny.net/submissionsviewall.php?pool_id=#{self.ib_id}"
end
end
sig { override.returns(Domain::DomainType) }
def self.domain_type
Domain::DomainType::Inkbunny
end
end

View File

@@ -26,4 +26,9 @@ class Domain::PostGroup::SofurryFolder < Domain::PostGroup
"https://www.sofurry.com/browse/folder/#{type}?by=#{owner_id}&folder=#{sofurry_id}"
end
end
sig { override.returns(Domain::DomainType) }
def self.domain_type
Domain::DomainType::Sofurry
end
end

View File

@@ -127,6 +127,16 @@ class Domain::User < ReduxApplicationRecord
has_many :followed_users, through: :user_user_follows_from, source: :to
has_many :followed_by_users, through: :user_user_follows_to, source: :from
has_many :follows_scans,
-> { where(kind: "follows").order(created_at: :asc) },
inverse_of: :user,
class_name: "Domain::UserJobEvent::FollowScan"
has_many :followed_by_scans,
-> { where(kind: "followed_by").order(created_at: :asc) },
inverse_of: :user,
class_name: "Domain::UserJobEvent::FollowScan"
sig { params(klass: T.class_of(Domain::Post)).void }
def self.has_created_posts!(klass)
self.class_has_created_posts = klass
@@ -137,9 +147,30 @@ class Domain::User < ReduxApplicationRecord
class_name: klass.name
end
sig { params(klass: T.class_of(Domain::Post)).void }
def self.has_faved_posts!(klass)
sig do
params(
klass: T.class_of(Domain::Post),
fav_model_type: T.class_of(Domain::UserPostFav),
fav_model_order: T.untyped,
).void
end
def self.has_faved_posts!(
klass,
fav_model_type = Domain::UserPostFav,
fav_model_order: nil
)
self.class_has_faved_posts = klass
has_many :user_post_favs,
-> do
rel = extending(CounterCacheWithFallback[:user_post_favs])
rel = rel.order(fav_model_order) if fav_model_order
rel
end,
class_name: fav_model_type.name,
inverse_of: :user,
dependent: :destroy
has_many :faved_posts,
-> { order(klass.param_order_attribute => :desc) },
through: :user_post_favs,
@@ -160,13 +191,11 @@ class Domain::User < ReduxApplicationRecord
sig { params(post_ids: T::Array[Integer], log_entry: HttpLogEntry).void }
def upsert_new_favs(post_ids, log_entry:)
self.class.transaction do
type = self.class.fav_model_type.name
if post_ids.any?
post_ids.each_slice(1000) do |slice|
self.user_post_favs.upsert_all(
slice.map { |post_id| { post_id:, type: } },
slice.map { |post_id| { post_id: } },
unique_by: %i[user_id post_id],
update_only: %i[type],
)
end
end
@@ -217,6 +246,7 @@ class Domain::User < ReduxApplicationRecord
inverse_of: :user
has_many :avatars,
-> { order(created_at: :desc) },
class_name: "::Domain::UserAvatar",
inverse_of: :user,
dependent: :destroy

View File

@@ -2,11 +2,20 @@
class Domain::User::BlueskyUser < Domain::User
aux_table :bluesky
due_timestamp :scanned_profile_at, 1.month
due_timestamp :scanned_posts_at, 1.week
due_timestamp :scanned_profile_at, 3.years
due_timestamp :scanned_posts_at, 3.years
has_created_posts! Domain::Post::BlueskyPost
has_faved_posts! Domain::Post::BlueskyPost
# TODO - when we scrape liked posts, add this back in
# has_faved_posts! Domain::Post::BlueskyPost
#
has_followed_users! Domain::User::BlueskyUser
has_followed_by_users! Domain::User::BlueskyUser
belongs_to :last_scan_log_entry, class_name: "HttpLogEntry", optional: true
belongs_to :last_posts_scan_log_entry,
class_name: "HttpLogEntry",
optional: true
enum :state,
{ ok: "ok", account_disabled: "account_disabled", error: "error" },
@@ -44,6 +53,11 @@ class Domain::User::BlueskyUser < Domain::User
"Bluesky"
end
sig { returns(String) }
def did!
T.must(did)
end
sig { override.returns(T.nilable(String)) }
def description_html_for_view
description
@@ -51,7 +65,12 @@ class Domain::User::BlueskyUser < Domain::User
sig { override.returns(T::Array[String]) }
def names_for_search
[display_name, handle].compact
names = [display_name, handle]
if (h = handle)&.ends_with?(".bsky.social")
names << h.split(".").first
end
names << did
names.compact
end
sig { override.returns(T.nilable(String)) }
@@ -61,7 +80,7 @@ class Domain::User::BlueskyUser < Domain::User
sig { override.returns(T.nilable(String)) }
def name_for_view
display_name || handle
display_name.present? ? display_name : "@#{handle}"
end
sig { override.returns(String) }

View File

@@ -17,7 +17,7 @@ class Domain::User::E621User < Domain::User
validates :e621_id, presence: true
validates :name, length: { minimum: 1 }, allow_nil: false
has_faved_posts! Domain::Post::E621Post
has_faved_posts! Domain::Post::E621Post, Domain::UserPostFav::E621UserPostFav
sig { override.returns([String, Symbol]) }
def self.param_prefix_and_attribute

View File

@@ -8,12 +8,6 @@ class Domain::User::FaUser < Domain::User
due_timestamp :scanned_followed_by_at, 3.months
due_timestamp :scanned_incremental_at, 1.month
# todo - set this to be the right fav model type
has_many :user_post_favs,
class_name: "Domain::UserPostFav",
foreign_key: :user_id,
inverse_of: :user
belongs_to :last_user_page_log_entry,
foreign_key: :last_user_page_id,
class_name: "::HttpLogEntry",
@@ -27,14 +21,18 @@ class Domain::User::FaUser < Domain::User
has_followed_users! Domain::User::FaUser
has_followed_by_users! Domain::User::FaUser
has_created_posts! Domain::Post::FaPost
has_faved_posts! Domain::Post::FaPost
has_faved_posts! Domain::Post::FaPost,
Domain::UserPostFav::FaUserPostFav,
fav_model_order: {
fa_fav_id: :desc,
}
enum :state,
{ ok: "ok", account_disabled: "account_disabled", error: "error" },
prefix: :state
validates :name, presence: true
validates :url_name, presence: true
validates :url_name, presence: true, uniqueness: true
validates :state, presence: true
# validates :account_status,
@@ -55,6 +53,24 @@ class Domain::User::FaUser < Domain::User
["fa", :url_name]
end
sig { returns(String) }
def sigil_for_view
case account_status
when "active"
"~"
when "suspended"
"!"
when "banned"
"-"
when "deceased"
""
when "admin"
"@"
else
"~"
end
end
sig { void }
def thing
Domain::User::FaUser.where(url_name: "test")
@@ -113,17 +129,10 @@ class Domain::User::FaUser < Domain::User
).returns(Domain::UserPostFav)
end
def update_fav_model(post_id:, fav_id: nil, explicit_time: nil)
model =
self.user_post_favs.find_by(post_id:) ||
self.user_post_favs.build(
type: self.class.fav_model_type.name,
post_id:,
)
if model.is_a?(Domain::UserPostFav::FaUserPostFav)
model.fav_id = fav_id if fav_id.present?
model.explicit_time = explicit_time if explicit_time.present?
model.save!
end
model = self.user_post_favs.find_or_initialize_by(post_id:)
model.fa_fav_id = fav_id if fav_id.present?
model.explicit_time = explicit_time.in_time_zone if explicit_time.present?
model.save!
model
end

View File

@@ -6,5 +6,5 @@ class Domain::UserJobEvent < ReduxApplicationRecord
abstract!
self.abstract_class = true
belongs_to :user, class_name: Domain::User.name
belongs_to :user, class_name: "Domain::User"
end

View File

@@ -0,0 +1,32 @@
# typed: strict
# frozen_string_literal: true
class Domain::UserJobEvent::FollowScan < Domain::UserJobEvent
self.table_name = "domain_user_job_event_follow_scans"
belongs_to :log_entry, class_name: "HttpLogEntry", optional: true
enum :state,
{ running: "running", error: "error", completed: "completed" },
prefix: true
enum :kind, { followed_by: "followed_by", follows: "follows" }, prefix: true
validates :state,
presence: true,
inclusion: {
in: %w[running error completed],
}
validates :kind, presence: true, inclusion: { in: %w[followed_by follows] }
validates :started_at, presence: true
validates :completed_at, presence: true, unless: :state_running?
validates :log_entry, presence: true, if: :state_completed?
before_validation do
self.state ||= "running" if new_record?
self.started_at ||= Time.current if new_record?
end
sig { params(json_attributes: T::Hash[Symbol, T.untyped]).void }
def update_json_attributes!(json_attributes)
self.json_attributes = json_attributes.merge(self.json_attributes)
save!
end
end

View File

@@ -0,0 +1,4 @@
class Domain::UserJobEvent::PostsScan < Domain::UserJobEvent
self.table_name = "domain_user_job_event_posts_scans"
belongs_to :log_entry, class_name: "HttpLogEntry", optional: true
end

View File

@@ -0,0 +1,4 @@
class Domain::UserJobEvent::ProfileScan < Domain::UserJobEvent
self.table_name = "domain_user_job_event_profile_scans"
belongs_to :log_entry, class_name: "HttpLogEntry", optional: true
end

View File

@@ -16,6 +16,16 @@ class Domain::UserPostFav < ReduxApplicationRecord
belongs_to :post, class_name: "Domain::Post", inverse_of: :user_post_favs
sig { params(user_klass: String, post_klass: String).void }
def self.user_post_fav_relationships(user_klass, post_klass)
belongs_to_with_counter_cache :user,
class_name: user_klass,
inverse_of: :user_post_favs,
counter_cache: :user_post_favs_count
belongs_to :post, class_name: post_klass, inverse_of: :user_post_favs
end
scope :for_post_type,
->(post_klass) do
post_klass = T.cast(post_klass, T.class_of(Domain::Post))

View File

@@ -0,0 +1,15 @@
# typed: strict
# frozen_string_literal: true
class Domain::UserPostFav::E621UserPostFav < Domain::UserPostFav
self.table_name = "domain_user_post_favs_e621"
# user_post_fav_relationships "Domain::User::E621User", "Domain::Post::E621Post"
belongs_to_with_counter_cache :user,
class_name: "Domain::User::E621User",
inverse_of: :user_post_favs,
counter_cache: :user_post_favs_count
belongs_to :post,
class_name: "Domain::Post::E621Post",
inverse_of: :user_post_favs
end

View File

@@ -1,32 +1,37 @@
# typed: strict
class Domain::UserPostFav::FaUserPostFav < Domain::UserPostFav
scope :with_explicit_time_and_id,
-> { where.not(explicit_time: nil).where.not(fav_id: nil) }
scope :with_inferred_time_and_id,
-> { where.not(inferred_time: nil).where.not(fav_id: nil) }
scope :with_fav_id, -> { where.not(fav_id: nil) }
attr_json :fav_id, :integer
attr_json :inferred_time, ActiveModelUtcTimeIntValue.new
attr_json :explicit_time, ActiveModelUtcTimeIntValue.new
validates :fav_id, uniqueness: true, if: :fav_id?
self.table_name = "domain_user_post_favs_fa"
# user_post_fav_relationships "Domain::User::FaUser", "Domain::Post::FaPost"
belongs_to_with_counter_cache :user,
class_name: "Domain::User::FaUser",
inverse_of: :user_post_favs,
counter_cache: :user_post_favs_count
belongs_to :post,
class_name: "Domain::Post::FaPost",
inverse_of: :user_post_favs
scope :with_explicit_time_and_id,
-> { where.not(explicit_time: nil).where.not(fa_fav_id: nil) }
validates :fa_fav_id, uniqueness: true, if: :fa_fav_id?
before_save :set_inferred_time
sig { void }
def set_inferred_time
if (e = explicit_time)
self.inferred_time = e
end
end
sig(:final) { override.returns(T.nilable(FavedAt)) }
def faved_at
if explicit_time.present?
return FavedAt.new(time: explicit_time, type: FavedAtType::Explicit)
if (e = explicit_time)
return(FavedAt.new(time: e.to_time, type: FavedAtType::Explicit))
end
if inferred_time.present?
return(FavedAt.new(time: inferred_time, type: FavedAtType::Inferred))
if (i = inferred_time)
return(FavedAt.new(time: i.to_time, type: FavedAtType::Inferred))
end
regression_model =
@@ -34,13 +39,13 @@ class Domain::UserPostFav::FaUserPostFav < Domain::UserPostFav
name: "fa_fav_id_and_date",
model_type: "square_root",
)
if regression_model.nil? || fav_id.nil?
if regression_model.nil? || fa_fav_id.nil?
return(
FavedAt.new(time: post&.posted_at&.to_time, type: FavedAtType::PostedAt)
)
end
date_i = regression_model.predict(fav_id.to_f).to_i
date_i = regression_model.predict(fa_fav_id.to_f).to_i
FavedAt.new(time: Time.at(date_i), type: FavedAtType::InferredNow)
end
end
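
A condensed sketch of the faved_at fallback order implemented above (explicit time, then inferred time, then a regression estimate over fa_fav_id, then the post's posted_at); faved_at_for and the lambda are hypothetical stand-ins for the model method and the stored "fa_fav_id_and_date" square-root model.

FavedAt = Struct.new(:time, :type)

def faved_at_for(explicit_time:, inferred_time:, fa_fav_id:, posted_at:, regression: nil)
  return FavedAt.new(explicit_time, :explicit) if explicit_time
  return FavedAt.new(inferred_time, :inferred) if inferred_time
  if regression && fa_fav_id
    return FavedAt.new(Time.at(regression.call(fa_fav_id.to_f).to_i), :inferred_now)
  end
  FavedAt.new(posted_at, :posted_at)
end

faved_at_for(
  explicit_time: nil,
  inferred_time: nil,
  fa_fav_id: 1_000_000,
  posted_at: nil,
  regression: ->(id) { 1.2e9 + Math.sqrt(id) * 1_000 }, # toy stand-in model
).type
# => :inferred_now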

View File

@@ -64,7 +64,7 @@ class ReduxApplicationRecord < ActiveRecord::Base
end
after_save do
T.bind(self, ReduxApplicationRecord)
# T.bind(self, ReduxApplicationRecord)
@after_save_deferred_jobs ||=
T.let([], T.nilable(T::Array[[DeferredJob, T.nilable(Scraper::JobBase)]]))

View File

@@ -1,3 +1,12 @@
# typed: strict
class Domain::User::BlueskyUserPolicy < Domain::UserPolicy
sig { returns(T::Boolean) }
def view_is_bluesky_user_monitored?
is_role_admin? || is_role_moderator?
end
sig { returns(T::Boolean) }
def monitor_bluesky_user?
is_role_admin?
end
end

View File

@@ -0,0 +1,9 @@
# typed: strict
class Domain::UserPostFav::FaUserPostFavPolicy < ApplicationPolicy
extend T::Sig
sig { returns(T::Boolean) }
def favorites?
true
end
end

View File

@@ -0,0 +1,13 @@
<span class="inline-block items-center gap-1 text-slate-500 hover:text-slate-700">
<%= link_to(
local_assigns[:url],
target: "_blank",
rel: "noopener noreferrer nofollow",
class: "hover:underline decoration-dotted",
) do %>
<i class="fa-solid fa-external-link-alt"></i>
<span class="text-blue-500">
<%= local_assigns[:link_text] %>
</span>
<% end %>
</span>

View File

@@ -4,12 +4,13 @@
<% visual_style = local_assigns[:visual_style] || "sky-link" %>
<% link_text = local_assigns[:link_text] || post.title_for_view %>
<% domain_icon = local_assigns[:domain_icon].nil? ? true : local_assigns[:domain_icon] %>
<% link_params = local_assigns[:link_params] || {} %>
<%=
react_component(
"PostHoverPreviewWrapper",
{
prerender: false,
props: props_for_post_hover_preview(post, link_text, visual_style, domain_icon:),
props: props_for_post_hover_preview(post, link_text, visual_style, domain_icon:, link_params:),
html_options: {
class: link_classes_for_visual_style(visual_style)
}

View File

@@ -1,10 +1,10 @@
<div
class="m-1 flex min-h-fit flex-col rounded-lg border border-slate-300 bg-slate-50 divide-y divide-slate-300 shadow-sm"
class="m-1 flex min-h-fit flex-col rounded-lg border border-slate-300 bg-slate-50 divide-y divide-slate-300 shadow-sm group sm:hover:shadow-md transition-shadow duration-200"
>
<div class="flex justify-between p-2">
<%= render "domain/posts/inline_postable_domain_link", post: post %>
</div>
<div class="flex flex-grow items-center justify-center p-2">
<div class="flex flex-grow items-center justify-center p-2 relative">
<% if (thumbnail_file = gallery_file_for_post(post)) && (thumbnail_path = thumbnail_for_post_path(post)) %>
<%= link_to domain_post_path(post) do %>
<%= image_tag thumbnail_path,
@@ -26,6 +26,12 @@
<% else %>
<span class="text-sm text-slate-500 italic">No file available</span>
<% end %>
<% if post.num_post_files_for_view > 1 %>
<div class="absolute top-0 right-0 bg-slate-600 text-slate-200 text-sm opacity-80 font-bold px-2 py-1 rounded-bl shadow-md flex items-center gap-1 group-hover:opacity-100 transition-opacity duration-200 pointer-events-none whitespace-nowrap z-10">
<i class="fa fa-images"></i>
<%= post.num_post_files_for_view %>
</div>
<% end %>
</div>
<div>
<h2 class="p-1 pb-0 text-center text-md">
@@ -52,7 +58,7 @@
<% end %>
<% end %>
<span class="flex-grow text-right">
<% user_post_fav = @post_favs && @post_favs[post.id] %>
<% user_post_fav = local_assigns[:user_post_fav] %>
<% if (faved_at = user_post_fav&.faved_at) && (time = faved_at.time) %>
<span class="flex items-center gap-1 justify-end">
<span

View File

@@ -1,8 +0,0 @@
<span>
<i class="fa-solid <%= icon_class %> mr-1"></i>
<%= label %>: <% if value.present? %>
<%= value %>
<% else %>
<span class="text-slate-400"> - </span>
<% end %>
</span>

View File

@@ -0,0 +1,11 @@
<% description_html = render_bsky_post_facets(
post.text,
post.post_raw&.dig("facets")
) %>
<%= sky_section_tag("Description") do %>
<% if description_html %>
<%= description_html %>
<% else %>
<div class="p-4 text-center text-slate-500">No description available</div>
<% end %>
<% end %>

View File

@@ -0,0 +1,27 @@
<% post = T.cast(post, Domain::Post::BlueskyPost) %>
<% if (quoted_post = post.quoted_post) %>
<span class="flex items-center">
<i class="fa-solid fa-quote-left mr-1"></i>
<%= render(
"domain/has_description_html/inline_link_domain_post",
post: quoted_post,
visual_style: "sky-link",
domain_icon: false,
link_text: "Quoted by #{post.primary_creator_name_fallback_for_view}"
)
%>
</span>
<% end %>
<% if (replying_to_post = post.replying_to_post) %>
<span class="flex items-center">
<i class="fa-solid fa-reply mr-1"></i>
<%= render(
"domain/has_description_html/inline_link_domain_post",
post: replying_to_post,
visual_style: "sky-link",
domain_icon: false,
link_text: "Replied to #{post.primary_creator_name_fallback_for_view}"
)
%>
</span>
<% end %>

View File

@@ -8,6 +8,10 @@
class: "text-blue-600 hover:underline",
target: "_blank",
title: scanned_hle.requested_at.strftime("%Y-%m-%d %H:%M:%S") %>
<% elsif post.is_a?(Domain::Post::BlueskyPost) && (sa = post.monitor_scanned_at)%>
<span title="<%= sa.strftime("%Y-%m-%d %H:%M:%S") %>">
Scanned via monitor <%= time_ago_in_words(sa) %> ago
</span>
<% else %>
Unknown when post scanned
<% end %>

View File

@@ -1,59 +1,20 @@
<% if policy(post).view_file? %>
<% ok_files = post.files.state_ok.order(:created_at).to_a %>
<% current_file = ok_files.first || post.primary_file_for_view %>
<% if ok_files.any? && current_file&.log_entry&.status_code == 200 %>
<!-- React PostFiles Component handles everything -->
<%
# Extract file index from URL parameter
idx_param = params[:idx]
initial_file_index = idx_param.present? ? idx_param.to_i : nil
%>
<%= react_component(
"PostFiles",
{
prerender: true,
props: props_for_post_files(ok_files, initial_file_index: initial_file_index),
html_options: {
id: "post-files-component"
}
}
) %>
<% elsif current_file.present? && (log_entry = current_file.log_entry) %>
<!-- Fallback for error states -->
<section id="file-display-section">
<section class="flex grow justify-center text-slate-500">
<div>
<i class="fa-solid fa-exclamation-triangle"></i>
File error
<% if log_entry.status_code == 404 %>
(404 not found)
<% else %>
(<%= log_entry.status_code %>)
<% end %>
</div>
</section>
</section>
<% elsif current_file.present? && current_file.state_pending? %>
<!-- Fallback for pending state -->
<section id="file-display-section">
<section class="flex grow justify-center text-slate-500">
<div>
<i class="fa-solid fa-file-arrow-down"></i>
File pending download
</div>
</section>
</section>
<% else %>
<!-- Fallback for no file -->
<section id="file-display-section">
<section class="flex grow justify-center overflow-clip">
<div class="text-slate-500">
<i class="fa-solid fa-file-arrow-down"></i>
No file
</div>
</section>
</section>
<% end %>
<% post_files = post.files.order(:created_at).to_a %>
<%=
react_component(
"PostFiles",
{
prerender: true,
props: props_for_post_files(
post_files:,
initial_file_index: params[:idx]&.to_i
),
html_options: {
id: "post-files-component"
}
}
)
%>
<% else %>
<section class="sky-section">
<%= link_to post.external_url_for_view.to_s,

View File

@@ -7,8 +7,10 @@
<% fprint_value = fprint&.fingerprint_value %>
<% fprint_detail_value = fprint&.fingerprint_detail_value %>
<% fprints = fprint && fprint_value && fprint_detail_value && find_similar_fingerprints(
fingerprint_value: fprint_value,
fingerprint_detail_value: fprint_detail_value,
[Domain::VisualSearchHelper::FingerprintAndDetail.new(
fingerprint: fprint_value,
detail_fingerprint: fprint_detail_value,
)],
limit: 5,
oversearch: 5,
includes: {post_file: :post},
@@ -26,6 +28,7 @@
<% fprints.each_with_index do |similar_fingerprint, index| %>
<div class="grid grid-cols-subgrid col-span-full max-w-max py-1 px-2 gap-2">
<% post = similar_fingerprint.fingerprint.post_file.post %>
<% post_file_idx = post.files.index { |f| f.id == similar_fingerprint.fingerprint.post_file_id } %>
<% creator = post.class.has_creators? ? post.creator : post.primary_fallback_creator_for_view %>
<div class="text-md items-center flex justify-end">
<div class="w-full text-center font-medium <%= match_badge_classes(similar_fingerprint.similarity_percentage) %>">
@@ -33,7 +36,9 @@
</div>
</div>
<div class="text-md self-center">
<%= render "domain/has_description_html/inline_link_domain_post", post: post, visual_style: "sky-link" %>
<%= render "domain/has_description_html/inline_link_domain_post", post: post, visual_style: "sky-link", link_params: {
idx: post_file_idx != 0 ? post_file_idx : nil,
} %>
</div>
<% if creator %>
<div class="text-md items-center">

View File

@@ -1 +1 @@
<%= render partial: "domain/posts/title_stat", locals: { label: "Rating", value: post.rating_for_view, icon_class: "fa-tag" } %>
<%= render partial: "shared/title_stat", locals: { label: "Rating", value: post.rating_for_view, icon_class: "fa-tag" } %>

View File

@@ -1,6 +1,6 @@
<%= render partial: "domain/posts/title_stat", locals: { label: "Views", value: post.num_views, icon_class: "fa-eye" } %>
<%= render partial: "domain/posts/title_stat", locals: { label: "Comments", value: post.num_comments, icon_class: "fa-comment" } %>
<%= render partial: "domain/posts/title_stat", locals: { label: "Status", value: post.status_for_view, icon_class: "fa-calendar-days" } %>
<%= render partial: "shared/title_stat", locals: { label: "Views", value: post.num_views, icon_class: "fa-eye" } %>
<%= render partial: "shared/title_stat", locals: { label: "Comments", value: post.num_comments, icon_class: "fa-comment" } %>
<%= render partial: "shared/title_stat", locals: { label: "Status", value: post.status_for_view, icon_class: "fa-calendar-days" } %>
<% if policy(post).view_tried_from_fur_archiver? %>
<% if post.fuzzysearch_checked_at? %>
<% hle = post.fuzzysearch_entry %>

View File

@@ -1,3 +1,3 @@
<%= render partial: "domain/posts/title_stat", locals: { label: "Views", value: post.num_views, icon_class: "fa-eye" } %>
<%= render partial: "domain/posts/title_stat", locals: { label: "Files", value: post.num_files, icon_class: "fa-file-image" } %>
<%= render partial: "domain/posts/title_stat", locals: { label: "Comments", value: post.num_comments, icon_class: "fa-comment" } %>
<%= render partial: "shared/title_stat", locals: { label: "Views", value: post.num_views, icon_class: "fa-eye" } %>
<%= render partial: "shared/title_stat", locals: { label: "Files", value: post.num_files, icon_class: "fa-file-image" } %>
<%= render partial: "shared/title_stat", locals: { label: "Comments", value: post.num_comments, icon_class: "fa-comment" } %>

Some files were not shown because too many files have changed in this diff.