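Contents
- Uploading and deleting S3 files and recognizing text in images with AWS
  - Preparation
    - Installing the AWS CLI
    - Initial AWS CLI configuration
    - S3 bucket access
    - Text recognition (Amazon Textract) access
  - AWS SDK
    - Uploading a file
      - Method 1
      - Method 2
    - Deleting a file
    - Image text recognition
      - Recognizing key-value documents such as invoices and bills
      - Plain text recognition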
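Preparation
Installing the AWS CLI
After installation completes, run the following two commands to verify that the AWS CLI was installed successfully. As in the example below, open the Terminal app on macOS, or cmd on Windows.
- where aws (Windows) / which aws (macOS): shows the AWS CLI installation path
- aws --version: shows the AWS CLI version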
    zonghan@MacBook-Pro ~ % aws --version
    aws-cli/2.0.30 Python/3.7.4 Darwin/21.6.0 botocore/2.0.0dev34
    zonghan@MacBook-Pro ~ % which aws
    /usr/local/bin/aws
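The same two checks can also be scripted. The following is a minimal sketch using only the Python standard library; the helper names find_aws_cli and aws_cli_version are made up for this illustration and are not part of any AWS tooling.

```python
import shutil
import subprocess
from typing import Optional


def find_aws_cli() -> Optional[str]:
    """Return the full path of the aws executable, or None if it is not on PATH."""
    # shutil.which plays the role of `which aws` (macOS) / `where aws` (Windows)
    return shutil.which("aws")


def aws_cli_version(executable: str) -> str:
    """Run `<executable> --version` and return the first line it prints."""
    completed = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    # Fall back to stderr in case the tool prints its version there
    output = completed.stdout or completed.stderr
    return output.strip().splitlines()[0]


if __name__ == "__main__":
    path = find_aws_cli()
    if path is None:
        print("aws was not found on PATH")
    else:
        print(path)
        print(aws_cli_version(path))
```

If the script reports that aws was not found, the installation directory is probably missing from PATH.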
Initial AWS CLI configuration
Before using the AWS CLI, run the aws configure command to complete the initial configuration.

    zonghan@MacBook-Pro ~ % aws configure
    AWS Access Key ID [None]: AKIA3GRZL6WIQEXAMPLE
    AWS Secret Access Key [None]: k+ci5r+hAcM3x61w1example
    Default region name [None]: ap-east-1
    Default output format [None]: json

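- The AWS Access Key ID and AWS Secret Access Key can be obtained from the AWS Management Console. The AWS CLI uses them, much like a username and password, to connect to AWS services.
In the AWS Management Console, click your user name in the upper-right corner, then choose Security Credentials.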
- Click Create New Access Key to create an Access Key ID and Secret Access Key pair, and save it. The pair can only be saved at creation time.
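- Default region name specifies the code of the AWS Region to connect to. The code for each AWS Region can be found in the AWS documentation.
- Default output format specifies the format of command-line output. JSON is the default for all output. Any of the following formats can be used:
  - JSON (JavaScript Object Notation)
  - YAML (available only in AWS CLI v2)
  - Text
  - Table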
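The values entered in aws configure are stored as INI-style text: the key pair goes into ~/.aws/credentials and the region and output format go into ~/.aws/config, each under a [default] section. The sketch below parses sample text of that shape with the standard library; read_profile is a hypothetical helper and the sample values are the placeholders from the session above.

```python
import configparser

# What ~/.aws/credentials looks like after the aws configure session above
SAMPLE_CREDENTIALS = """\
[default]
aws_access_key_id = AKIA3GRZL6WIQEXAMPLE
aws_secret_access_key = k+ci5r+hAcM3x61w1example
"""

# What ~/.aws/config looks like after the same session
SAMPLE_CONFIG = """\
[default]
region = ap-east-1
output = json
"""


def read_profile(ini_text: str, profile: str = "default") -> dict:
    """Return the key/value pairs of one profile section as a plain dict."""
    # interpolation=None keeps any '%' characters in values literal
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return dict(parser[profile])


if __name__ == "__main__":
    print(read_profile(SAMPLE_CONFIG))
    print(sorted(read_profile(SAMPLE_CREDENTIALS)))
```

boto3 reads the same two files, which is why the SDK code later in this article does not pass any credentials explicitly.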
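S3 bucket access
The authenticated user configured on this computer must have permission to access an S3 bucket. This is usually granted by your administrator.
Text recognition (Amazon Textract) access
The authenticated user configured on this computer must also have Amazon Textract permissions. This, too, is usually granted by your administrator.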
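To make the request to an administrator concrete, here is a sketch of an IAM policy document covering the operations used in this article. It is an assumption about what a minimal policy could look like, not the policy the author's administrator used; build_policy and the bucket name my-bucket are hypothetical.

```python
import json


def build_policy(bucket: str) -> dict:
    """Build an IAM policy document for uploading, deleting, and analyzing files."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level S3 access; GetObject lets Textract read the
                # document from S3 on behalf of the calling user
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
            {
                # The two Textract operations relevant to this article
                "Effect": "Allow",
                "Action": [
                    "textract:AnalyzeExpense",
                    "textract:DetectDocumentText",
                ],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy("my-bucket"), indent=2))
```

An administrator may well grant broader or narrower permissions than this; the sketch only shows which actions the code in this article relies on.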
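AWS SDK

    import boto3
    from botocore.exceptions import ClientError, BotoCoreError

Install the boto3 module imported above (for example with pip install boto3). The botocore module is normally installed along with it.
Uploading a file
Method 1
Use the upload_file method to upload a file.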

    import logging
    import os

    import boto3
    from boto3.exceptions import S3UploadFailedError
    from botocore.exceptions import ClientError


    def upload_file(file_path, bucket, file_name=None):
        """Upload a file to an S3 bucket.

        :param file_path: Local path of the file to upload
        :param bucket: Bucket to upload to
        :param file_name: S3 object name. If not specified, the base name of file_path is used
        :return: True if the file was uploaded, else False
        """
        # If the S3 object name was not specified, use the base name of file_path
        if file_name is None:
            file_name = os.path.basename(file_path)

        # Upload the file
        s3_client = boto3.client('s3')
        # s3 = boto3.resource('s3')
        try:
            s3_client.upload_file(file_path, bucket, file_name)
            # s3.Bucket(bucket).upload_file(file_path, file_name)
        except (ClientError, S3UploadFailedError) as e:
            # upload_file wraps S3 client errors in S3UploadFailedError
            logging.error(e)
            return False
        return True

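upload_file also accepts an ExtraArgs dictionary, which can carry object metadata such as ContentType. The sketch below guesses the content type from the file extension so that the stored object is not left with a generic binary type. The function upload_with_content_type is hypothetical, and the client is passed in as a parameter (an assumption made here so the function can be exercised without AWS access).

```python
import mimetypes
import os


def upload_with_content_type(s3_client, file_path, bucket, key=None):
    """Upload file_path and set ContentType from the file extension when known.

    s3_client is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    Returns the object key that was used.
    """
    if key is None:
        key = os.path.basename(file_path)

    # guess_type returns (type, encoding); type is None for unknown extensions
    content_type, _ = mimetypes.guess_type(file_path)
    extra_args = {"ContentType": content_type} if content_type else None

    s3_client.upload_file(file_path, bucket, key, ExtraArgs=extra_args)
    return key
```

With boto3 installed and credentials configured, a call would look like upload_with_content_type(boto3.client("s3"), "invoice.pdf", "my-bucket").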
Method 2
Use PutObject to upload a file.

    import logging
    import os

    import boto3
    from botocore.exceptions import BotoCoreError, ClientError

    logger = logging.getLogger(__name__)


    def upload_file_to_aws(file_path, bucket, file_name=None):
        """Upload a file to an S3 bucket.

        :param file_path: Local path of the file to upload
        :param bucket: Bucket to upload to
        :param file_name: S3 object name. If not specified, the base name of file_path is used
        :return: True if the file was uploaded, else False
        """
        # If the S3 object name was not specified, use the base name of file_path
        if file_name is None:
            file_name = os.path.basename(file_path)

        # Upload the file
        s3 = boto3.resource('s3')
        try:
            with open(file_path, 'rb') as f:
                data = f.read()
            obj = s3.Object(bucket, file_name)
            obj.put(Body=data)
        except (BotoCoreError, ClientError) as e:
            # Service-side failures raise ClientError, which is not a BotoCoreError
            logger.error(e)
            return False
        return True

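PutObject also accepts an open binary file object as Body, so the file does not have to be read into a bytes object first. The following is a minimal sketch of that variation using the client-level put_object call; put_file is a hypothetical name, and the client is again passed in as a parameter for illustration.

```python
def put_file(s3_client, file_path, bucket, key):
    """Upload file_path with PutObject, handing the open file object to the SDK.

    s3_client is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    Returns the PutObject response, which includes the object's ETag.
    """
    with open(file_path, "rb") as f:
        # The request is sent inside the with block, while the file is still open
        return s3_client.put_object(Bucket=bucket, Key=key, Body=f)
```

PutObject sends the whole object in a single request, so for very large files the upload_file method from Method 1 is usually the better fit, since it can split the transfer into parts.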
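Deleting a file

    def delete_aws_file(file_name, bucket):
        """Delete the object named file_name from the given S3 bucket."""
        try:
            s3_client = boto3.client("s3")
            s3_client.delete_object(Bucket=bucket, Key=file_name)
        except Exception as e:
            # Log the error and continue; the caller is not notified of the failure
            logger.info(e)
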
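When many objects have to be removed, calling delete_object once per key becomes slow. S3 also provides DeleteObjects, which accepts up to 1000 keys per request. The sketch below batches keys accordingly; delete_many is a hypothetical helper, and the client is passed in as a parameter for illustration.

```python
def delete_many(s3_client, bucket, keys, batch_size=1000):
    """Delete the given keys in batches and return the keys that failed.

    s3_client is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    DeleteObjects accepts at most 1000 keys per request, hence the batching.
    """
    keys = list(keys)
    failed = []
    for start in range(0, len(keys), batch_size):
        batch = keys[start:start + batch_size]
        response = s3_client.delete_objects(
            Bucket=bucket,
            # Quiet mode: the response lists only the keys that could not be deleted
            Delete={"Objects": [{"Key": k} for k in batch], "Quiet": True},
        )
        failed.extend(err["Key"] for err in response.get("Errors", []))
    return failed
```

An empty return value means every batch was accepted without per-key errors.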
Image text recognition
Recognizing key-value documents such as invoices and bills

    def get_labels_and_values(result, field):
        """Append {label: value} to result when the field has both a label and a value."""
        if "LabelDetection" in field:
            key = field["LabelDetection"].get("Text")
            value = field.get("ValueDetection", {}).get("Text")
            if key and value:
                if key.endswith(":"):
                    key = key[:-1]
                result.append({key: value})


    def process_text_detection(bucket, document):
        try:
            client = boto3.client("textract", region_name="ap-south-1")
            response = client.analyze_expense(
                Document={"S3Object": {"Bucket": bucket, "Name": document}}
            )
        except Exception as e:
            logger.info(e)
            # A plain string cannot be raised; raise an exception instance instead
            raise RuntimeError("An unknown error occurred on the aws service") from e

        # A list, because get_labels_and_values appends one {label: value} dict per field
        result = []
        for expense_doc in response["ExpenseDocuments"]:
            for line_item_group in expense_doc["LineItemGroups"]:
                for line_items in line_item_group["LineItems"]:
                    for expense_fields in line_items["LineItemExpenseFields"]:
                        get_labels_and_values(result, expense_fields)
            for summary_field in expense_doc["SummaryFields"]:
                get_labels_and_values(result, summary_field)
        return result


    def get_extract_info(bucket, document):
        return process_text_detection(bucket, document)

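For plain text recognition, as opposed to key-value expense fields, Textract offers the DetectDocumentText operation. Its response contains a list of Blocks, and the blocks of type LINE carry one recognized line of text each. The following is a minimal sketch under those assumptions; extract_lines and detect_text are hypothetical names, and the Textract client is passed in as a parameter for illustration.

```python
def extract_lines(response):
    """Return the text of every LINE block in a DetectDocumentText response."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE" and "Text" in block
    ]


def detect_text(textract_client, bucket, document):
    """Run plain text detection on an S3 object and return its lines.

    textract_client is expected to be a boto3 Textract client,
    e.g. boto3.client("textract", region_name="ap-south-1").
    """
    response = textract_client.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": document}}
    )
    return extract_lines(response)
```

The response also contains PAGE and WORD blocks; filtering on LINE keeps the output at the granularity of lines as they appear in the image.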