OpenProject is the leading open source project management software.
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
openproject/app/workers/extract_fulltext_job.rb

95 lines
2.9 KiB

#-- encoding: UTF-8
#-- copyright
# OpenProject is an open source project management software.
# Copyright (C) 2012-2021 the OpenProject GmbH
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License version 3.
#
# OpenProject is a fork of ChiliProject, which is a fork of Redmine. The copyright follows:
# Copyright (C) 2006-2013 Jean-Philippe Lang
# Copyright (C) 2010-2013 the ChiliProject Team
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
#
# See docs/COPYRIGHT.rdoc for more details.
#++
class ExtractFulltextJob < ApplicationJob
queue_with_priority :low
def perform(attachment_id)
@attachment_id = attachment_id
@attachment = nil
@text = nil
@file = nil
@filename = nil
@language = OpenProject::Configuration.main_content_language
return unless OpenProject::Database.allows_tsv?
return unless @attachment = find_attachment(attachment_id)
init
update
ensure
FileUtils.rm @file.path if delete_file?
end
private
def init
carrierwave_uploader = @attachment.file
@file = carrierwave_uploader.local_file
@filename = carrierwave_uploader.file.filename
if @attachment.readable?
resolver = Plaintext::Resolver.new(@file, @attachment.content_type)
@text = resolver.text
end
rescue StandardError => e
Rails.logger.error(
"Failed to extract plaintext from file #{@attachment&.id} (On domain #{Setting.host_name}): #{e}: #{e.message}"
)
end
def update
Attachment
.where(id: @attachment_id)
.update_all(['fulltext = ?, fulltext_tsv = to_tsvector(?, ?), file_tsv = to_tsvector(?, ?)',
@text,
@language,
OpenProject::FullTextSearch.normalize_text(@text),
@language,
OpenProject::FullTextSearch.normalize_filename(@filename)])
rescue StandardError => e
Rails.logger.error(
"Failed to update TSV values for attachment #{@attachment&.id} (On domain #{Setting.host_name}): #{e.message[0..499]}[...]"
)
end
def find_attachment(id)
[37868] Whitelist for attachment mime types and extensions on upload (#9431) * Add setting for whitelist * Make attachments API BaseServices compatible * Add prepare service and contract * Correctly pass the filename to the UploadedFile * Add presence check to filename * Fix expected validation message * We no longer raise a multipart error when metadata is empty * Fix filesize validation on prepared uploads * Add parser error if invalid metadata json * When attachment is not saved, use filename property * Return correct error message on JSON parser erroro * Fix specs * Use attachment upload representer * Fix direct uploads mocks with new service layer * Lint * Fix export job using attachment service * Fix IFC controller using attachment prepare service * Fix export job * RenameRename params_getter to params_source * Fix mail handler using attachment service * Fix usage of attachment create service in documents * Reuse shared examples for document attachment spec * Fix stubbed attachment service in export job spec * Use admin user in backup spec * Fix export job for bim * Fix attachment integration spec * Fix issues_controller spec * Make budget resource spec reuse common examples * Fix attachment parsing representer spec * Replace prepare part of attachment spec into separate service spec * Clear cache for login spec * Convert document create/update into services * Budget services * Allow options to be passed to property twin * Remove setting author on budget initialize * Replace meetings update with services * Replace ifc models attachment handling with services * Don't check uploader if changed by system * Fix uploader being changed by system * Replace wiki page attach_files with attachable services * Replace avatar saving * Replace snapshot attach_files * Skip double validation when container present * Set snapshot through attachment service * Remove attach_files * Validate content type in contract * Enforce writing the content type without accepting user input * Expect changed content_type * Fix content of viewpoint image to get correct content type * Fix tsv spec * Add create contract spec * Bypass whitelist in internal services when conflicting with user * Fix expects in specs after whitelist bypass * Render contract errors for wiki * Add before_hook to bodied to allow to pre-authorize permissions * Budget errors from contract * Document errors from contract
3 years ago
Attachment.find_by(id: id)
end
def remote_file?
!@attachment&.file.is_a?(LocalFileUploader)
end
def delete_file?
remote_file? && @file
end
end