linux_media

mirror of https://github.com/tbsdtv/linux_media.git synced 2025-07-23 12:43:29 +02:00

Author	SHA1	Message	Date
Mustafa Ismail	30ed9ee9a1	RDMA/irdma: Do not generate SW completions for NOPs Currently, artificial SW completions are generated for NOP wqes which can generate unexpected completions with wr_id = 0. Skip the generation of artificial completions for NOPs. Fixes: `81091d7696` ("RDMA/irdma: Add SW mechanism to generate completions on error") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230315145231.931-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2023-03-19 11:37:55 +02:00
Mustafa Ismail	24419777e9	RDMA/irdma: Fix RQ completion opcode The opcode written by HW, in the RQ CQE, is the RoCEv2/iWARP protocol opcode from the received packet and not the SW opcode as currently assumed. Fix this by returning the raw operation type and queue type in the CQE to irdma_process_cqe and add 2 helpers set_ib_wc_op_sq set_ib_wc_op_rq to map IRDMA HW op types to IB op types. Note that for iWARP, only Write with Immediate is supported so the opcode can only be IB_WC_RECV_RDMA_WITH_IMM when there is immediate data present. Fixes: `b48c24c2d7` ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20221115011701.1379-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2022-11-17 10:41:28 +02:00
Jason Gunthorpe	33331a728c	Merge tag 'v6.0' into rdma.git for-next Trvial merge conflicts against rdma.git for-rc resolved matching linux-next: drivers/infiniband/hw/hns/hns_roce_hw_v2.c drivers/infiniband/hw/hns/hns_roce_main.c https://lore.kernel.org/r/20220929124005.105149-1-broonie@kernel.org Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-10-06 19:48:45 -03:00
Sindhu-Devale	7f51a961f8	RDMA/irdma: Align AE id codes to correct flush code and event A number of asynchronous event (AE) ids were not aligned to the correct flush_code and event_type. Fix these up so that the correct IBV error and event codes are returned to application. Also, add handling for new AE ids like IRDMA_AE_INVALID_REQUEST to return the correct WC error code. Fixes: `44d9e52977` ("RDMA/irdma: Implement device initialization definitions") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220907191324.1173-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2022-09-20 13:19:52 +03:00
Sindhu-Devale	6b227bd32d	RDMA/irdma: Return error on MR deregister CQP failure The MR deregister CQP can fail if an MW is bound to it. Return an appropriate error for this case. Fixes: `b48c24c2d7` ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220906223244.1119-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2022-09-07 11:22:17 +03:00
Shiraz Saleem	ead54ced63	RDMA/irdma: Fix drain SQ hang with no completion SW generated completions for outstanding WRs posted on SQ after QP is in error target the wrong CQ. This causes the ib_drain_sq to hang with no completion. Fix this to generate completions on the right CQ. [ 863.969340] INFO: task kworker/u52:2:671 blocked for more than 122 seconds. [ 863.979224] Not tainted 5.14.0-130.el9.x86_64 #1 [ 863.986588] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 863.996997] task:kworker/u52:2 state:D stack: 0 pid: 671 ppid: 2 flags:0x00004000 [ 864.007272] Workqueue: xprtiod xprt_autoclose [sunrpc] [ 864.014056] Call Trace: [ 864.017575] __schedule+0x206/0x580 [ 864.022296] schedule+0x43/0xa0 [ 864.026736] schedule_timeout+0x115/0x150 [ 864.032185] __wait_for_common+0x93/0x1d0 [ 864.037717] ? usleep_range_state+0x90/0x90 [ 864.043368] __ib_drain_sq+0xf6/0x170 [ib_core] [ 864.049371] ? __rdma_block_iter_next+0x80/0x80 [ib_core] [ 864.056240] ib_drain_sq+0x66/0x70 [ib_core] [ 864.062003] rpcrdma_xprt_disconnect+0x82/0x3b0 [rpcrdma] [ 864.069365] ? xprt_prepare_transmit+0x5d/0xc0 [sunrpc] [ 864.076386] xprt_rdma_close+0xe/0x30 [rpcrdma] [ 864.082593] xprt_autoclose+0x52/0x100 [sunrpc] [ 864.088718] process_one_work+0x1e8/0x3c0 [ 864.094170] worker_thread+0x50/0x3b0 [ 864.099109] ? rescuer_thread+0x370/0x370 [ 864.104473] kthread+0x149/0x170 [ 864.109022] ? set_kthread_struct+0x40/0x40 [ 864.114713] ret_from_fork+0x22/0x30 Fixes: `81091d7696` ("RDMA/irdma: Add SW mechanism to generate completions on error") Link: https://lore.kernel.org/r/20220824154358.117-1-shiraz.saleem@intel.com Reported-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2022-08-28 12:43:37 +03:00
Mustafa Ismail	36a26d1239	RDMA/irdma: Make CQP invalid state error non-critical The invalid state error returned by the Control Queue-Pair (CQP) is not a critical error. Add it to the irdma_noncrit_err_list and drop reporting it as device error message. Link: https://lore.kernel.org/r/20220705230815.265-4-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2022-07-18 10:39:43 +03:00
Jason Gunthorpe	a6f844da39	Merge tag 'v5.18' into rdma.git for-next Following patches have dependencies. Resolve the merge conflict in drivers/net/ethernet/mellanox/mlx5/core/main.c by keeping the new names for the fs functions following linux-next: https://lore.kernel.org/r/20220519113529.226bc3e2@canb.auug.org.au/ Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-24 12:40:28 -03:00
Mustafa Ismail	81091d7696	RDMA/irdma: Add SW mechanism to generate completions on error HW flushes after QP in error state is not reliable. This can lead to application hang waiting on a completion for outstanding WRs. Implement a SW mechanism to generate completions for any outstanding WR's after the QP is modified to error. This is accomplished by starting a delayed worker after the QP is modified to error and the HW flush is performed. The worker will generate completions that will be returned to the application when it polls the CQ. This mechanism only applies to Kernel applications. Link: https://lore.kernel.org/r/20220425181624.1617-1-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-11 15:58:40 -03:00
Mustafa Ismail	1c9043ae06	RDMA/irdma: Fix possible crash due to NULL netdev in notifier For some net events in irdma_net_event notifier, the netdev can be NULL which will cause a crash in rdma_vlan_dev_real_dev. Fix this by moving all processing to the NETEVENT_NEIGH_UPDATE case where the netdev is guaranteed to not be NULL. Fixes: `6702bc1474` ("RDMA/irdma: Fix netdev notifications for vlan's") Link: https://lore.kernel.org/r/20220425181703.1634-4-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-02 11:10:33 -03:00
Mustafa Ismail	6702bc1474	RDMA/irdma: Fix netdev notifications for vlan's Currently, events on vlan netdevs are being ignored. Fix this by finding the real netdev and processing the notifications for vlan netdevs. Fixes: `915cc7ac0f` ("RDMA/irdma: Add miscellaneous utility definitions") Link: https://lore.kernel.org/r/20220225163211.127-2-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-02-28 12:07:40 -04:00
Shiraz Saleem	2c4b14ea95	RDMA/irdma: Remove enum irdma_status_code Replace use of custom irdma_status_code with linux error codes. Remove enum irdma_status_code and header in which its defined. Link: https://lore.kernel.org/r/20220217151851.1518-2-shiraz.saleem@intel.com Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-02-23 15:24:18 -04:00
Tatyana Nikolova	10467ce09f	RDMA/irdma: Don't arm the CQ more than two times if no CE for this CQ Completion events (CEs) are lost if the application is allowed to arm the CQ more than two times when no new CE for this CQ has been generated by the HW. Check if arming has been done for the CQ and if not, arm the CQ for any event otherwise promote to arm the CQ for any event only when the last arm event was solicited. Fixes: `b48c24c2d7` ("RDMA/irdma: Implement device supported verb APIs") Link: https://lore.kernel.org/r/20211201231509.1930-2-shiraz.saleem@intel.com Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-12-07 13:53:01 -04:00
Christophe JAILLET	117697cc93	RDMA/irdma: Fix a potential memory allocation issue in 'irdma_prm_add_pble_mem()' 'pchunk->bitmapbuf' is a bitmap. Its size (in number of bits) is stored in 'pchunk->sizeofbitmap'. When it is allocated, the size (in bytes) is computed by: size_in_bits >> 3 There are 2 issues (numbers bellow assume that longs are 64 bits): - there is no guarantee here that 'pchunk->bitmapmem.size' is modulo BITS_PER_LONG but bitmaps are stored as longs (sizeofbitmap=8 bits will only allocate 1 byte, instead of 8 (1 long)) - the number of bytes is computed with a shift, not a round up, so we may allocate less memory than needed (sizeofbitmap=65 bits will only allocate 8 bytes (i.e. 1 long), when 2 longs are needed = 16 bytes) Fix both issues by using 'bitmap_zalloc()' and remove the useless 'bitmapmem' from 'struct irdma_chunk'. While at it, remove some useless NULL test before calling kfree/bitmap_free. Fixes: `915cc7ac0f` ("RDMA/irdma: Add miscellaneous utility definitions") Link: https://lore.kernel.org/r/5e670b640508e14b1869c3e8e4fb970d78cbe997.1638692171.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-12-07 13:45:48 -04:00
Jakub Kicinski	fd92213e9a	RDMA: Constify netdev->dev_addr accesses netdev->dev_addr will become const soon, make sure drivers propagate the qualifier. Link: https://lore.kernel.org/r/20211019182604.1441387-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-25 14:33:09 -03:00
Zhu Yanjun	9d8f247cc3	RDMA/irdma: Remove irdma_cqp_up_map_cmd() The function irdma_cqp_up_map_cmd() is not used. So remove it. Link: https://lore.kernel.org/r/20211011110128.4057-5-yanjun.zhu@linux.dev Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-12 13:21:40 -03:00
Zhu Yanjun	16ddcfca56	RDMA/irdma: Remove irdma_get_hw_addr() The function irdma_get_hw_addr() is not used. So remove it. Link: https://lore.kernel.org/r/20211011110128.4057-4-yanjun.zhu@linux.dev Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-12 13:21:39 -03:00
Sindhu Devale	5b1e985f76	RDMA/irdma: Skip CQP ring during a reset Due to duplicate reset flags, CQP commands are processed during reset. This leads CQP failures such as below: irdma0: [Delete Local MAC Entry Cmd Error][op_code=49] status=-27 waiting=1 completion_err=0 maj=0x0 min=0x0 Remove the redundant flag and set the correct reset flag so CPQ is paused during reset Fixes: `8498a30e1b` ("RDMA/irdma: Register auxiliary driver and implement private channel OPs") Link: https://lore.kernel.org/r/20210916191222.824-2-shiraz.saleem@intel.com Reported-by: LiLiang <liali@redhat.com> Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-09-20 14:13:22 -03:00
Leon Romanovsky	514aee660d	RDMA: Globally allocate and release QP memory Convert QP object to follow IB/core general allocation scheme. That change allows us to make sure that restrack properly kref the memory. Link: https://lore.kernel.org/r/48e767124758aeecc433360ddd85eaa6325b34d9.1627040189.git.leonro@nvidia.com Reviewed-by: Gal Pressman <galpress@amazon.com> #efa Tested-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> #rdma and core Tested-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-08-03 13:44:27 -03:00
Shiraz Saleem	1f70075722	RDMA/irdma: Fix potential overflow expression in irdma_prm_get_pbles Coverity reports a signed 32-bit overflow on "1 << pprm->pble_shift" when used expression to compute bits_needed that expects 64bit, unsigned. Fix this by using the 1ULL in the left shift operator and convert mem_size to u64. Link: https://lore.kernel.org/r/20210625162329.1654-3-tatyana.e.nikolova@intel.com Reported-by: coverity-bot <keescook+coverity-bot@chromium.org> Addresses-Coverity-ID: 1505157 ("Integer handling issues") Fixes: `915cc7ac0f` ("RDMA/irdma: Add miscellaneous utility definitions") Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-25 14:08:30 -03:00
Shiraz Saleem	2db7b2eac7	RDMA/irdma: Store PBL info address a pointer type The level1 PBL info address is stored as u64. This requires casting through a uinptr_t before used as a pointer type. And this leads to sparse warning such as this when uinptr_t is missing: drivers/infiniband/hw/irdma/hw.c: In function 'irdma_destroy_virt_aeq': drivers/infiniband/hw/irdma/hw.c:579:23: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] 579 \| dma_addr_t pg_arr = (dma_addr_t )aeq->palloc.level1.addr; This can be fixed using an intermediate uintptr_t, but rather it is better to fix the structure irdm_pble_info to store the address as u64* and the VA it is assigned in irdma_chunk as a void*. This greatly reduces the casting on this address. Fixes: `44d9e52977` ("RDMA/irdma: Implement device initialization definitions") Link: https://lore.kernel.org/r/20210609234924.938-1-shiraz.saleem@intel.com Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-10 09:39:27 -03:00
Shiraz Saleem	6246f1ccb9	RDMA/irdma: Use list_last_entry/list_first_entry Use list_last_entry and list_first_entry instead of using prev and next pointers. Link: https://lore.kernel.org/r/20210608211415.680-1-shiraz.saleem@intel.com Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-08 20:04:00 -03:00
Colin Ian King	205be5dc99	RDMA/irdma: Fix spelling mistake "Allocal" -> "Allocate" There is a spelling mistake in a literal string. Fix it. Link: https://lore.kernel.org/r/20210607113345.82206-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-07 20:20:32 -03:00
Mustafa Ismail	915cc7ac0f	RDMA/irdma: Add miscellaneous utility definitions Add miscellaneous utility functions and headers. Link: https://lore.kernel.org/r/20210602205138.889-13-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-02 19:55:19 -03:00

24 Commits